To answer the question though, I think probably writing a robust web scraper to search event listings and turn them into a sharable calendar. It'd be trivial these days, but I did it in 1999 in Perl with regexes.
Hah. Always. Hindsight bias and impostor syndrome are a fun mix! I remember writing a blog suite (with comments!) in Perl in the late 90s; back then, without S.O. and other knowledge-sharing beyond some Usenet forums, we invented the wheels as we went along... it was all hard.
Turns out the bias current node (external RBIAS resistor sets bias current) for PCIe was routed too close to an inductor for a power rail. When the CPU was warm, the power rail pulled more current, causing the inductor to ring more, causing the crosstalk on the bias net to screw up the PCIe subsystem and hang the CPU.
Found the issue accidentally on a layout change. Had to prove it by drilling out the via and re-routing the signal with wire.
Then the next hardest problem I found was implementing Genetic Programming in Python (year 2005) http://paraschopra.com/sourcecode/GP/index.php
It was fun but extremely hard for me (at that age!).
After that, in 2008, I think the trickiest part for me was writing the initial visual editor for Visual Website Optimizer. It involved a reverse proxy and inserting JavaScript code into the reverse-proxied page, letting the user visually edit the page contents.
Fun days. These days I hardly get to code, though last year I gifted my wife a website (http://wowsig.com) which was super fun.
Along the way I'm pretty sure I also figured out how to build SSA form such that your alias analysis results are available to be used at SSA construction, avoiding redundant computation [2]. I never got to chase that down, but it was really interesting.
[1] http://paulbiggar.com/research/#phd-dissertation, esp chapter 6.
* A test suite we wrote for a client's project before a massive refactor was stalling randomly, but would continue when you tried to diagnose the problem. Turns out their user creation code used /dev/random, and the system was running out of entropy, so the code was blocking. Moving the mouse or typing on the keyboard would add entropy and cause the tests to resume. The fix was to use /dev/urandom for tests.
* Found a weird issue with an embedded network stack where a limited-broadcast packet to more than 3 devices would get responses from only a few of them, but directed packets to each device would work fine. Devices reported successfully receiving and transmitting when monitored over a serial console. The issue turned out to be a bug in the ARP implementation: it would incorrectly store any ARP response it saw (rather than only the responses the device had requested). Since the embedded system had a small ARP cache due to memory constraints, when multiple devices wanted to respond they would all send ARP requests, and the flood of responses would flush the ARP cache; when the network stack then wanted to send its response, it didn't know what MAC to use and just dropped the packet on the floor. A workaround was to increase the ARP cache size.
Funny how this goes completely against the typical operant conditioning a user undergoes when working with computers. Usually if your software hangs, you want to touch nothing and let it finish. But in this case it's actually additional user activity that's needed.
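The /dev/random stall above is easy to avoid in test code. A minimal sketch in Python (the function name is mine; `os.urandom` draws from the kernel CSPRNG and never blocks waiting for entropy, unlike reads from /dev/random on older Linux kernels):

```python
import os

def make_user_token(nbytes=16):
    # os.urandom reads from the non-blocking kernel CSPRNG
    # (the /dev/urandom pool), so tests never stall on entropy.
    return os.urandom(nbytes).hex()

token = make_user_token()
```

For test fixtures this is exactly what you want: unpredictable enough for uniqueness, with no dependency on the system's entropy estimate.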
http://www.quora.com/Whats-the-hardest-bug-youve-debugged
My favourite answers:
Crash Bandicoot: http://www.quora.com/Whats-the-hardest-bug-youve-debugged/an...
Flash Player: http://www.quora.com/Whats-the-hardest-bug-youve-debugged/an...
500-miles email: http://www.ibiblio.org/harris/500milemail.html
Eventually we found that the internal JTAG pull-up resistances would not be sufficient at certain voltages/temperatures. So it wasn't latch-up in the end; the JTAG would halt the processor.
We only found it after days of testing in an oven cycling temperature while stimulating a coil (RF field) close to the device while varying the supply voltage to cause the condition.
All the while the client was not happy that his devices randomly stopped working, so we were under quite a bit of pressure.
And of course we only started looking at the hardware after we spent quite a bit of time thinking it was a software bug somewhere.
My problem was JavaScript logic failing to fire, and the answer was to wait for document load. Simple, I know, but I had nobody around me (physically) who could help, and explaining the problem to people in forums seemed impossibly abstract, primarily because I did not understand what the problem was. It was the context around my code that I had to fix, but I kept looking in the code itself.
That was probably one of my first big "ah ha!" moments and these moments are one of the reasons I still love programming. Tenacity, luck and skill became irrevocably connected that day.
I've solved many other problems over the years, far tougher than this one, but maybe never tougher for me in a relative scale. If I had never solved that problem, I sometimes believe my life path would have been totally different.
Nowadays most of the problems I face have been solved by someone else in a slightly different context and searching for/implementing existing solutions is almost trivial.
The formula is how much protein, carbs, and fat to eat, plus an appropriate exercise regimen of three half-hour workout sessions a week. No supplements or anything else; just food and small amounts of exercise to stimulate a hormone response. This is way more complex than some tracker or calorie counter: it takes into account insulin spikes, metabolic damage assessments, glycogen storage, and much more. The hard part was integrating ~10 different disciplines across the sciences. Everyone had a piece of the puzzle, but we had to put it together.
That is then fed into an app that picks foods for you based on your formula, which is then constantly refined based on your results. We took 1000 people through test runs, tweaking our code to get it right. Now it works for everyone we put on it who actually uses the system.
Our next challenge is the psychology and habit forming parts of the app we have built.
Oh, and of course competing with well-funded competitors in the space, but at least nobody can claim our results, because they just track things instead of letting people really plan for health.
Edit: Since you asked, it's called mPact (for metabolism impact) and the corporate site is at http://mPact.io
For me, the first hard thing was implementing this https://en.wikipedia.org/wiki/Dead-end_elimination#Generaliz...
Probably because it was the first algorithm I implemented with no reference implementation to look at.
The second hardest was a high-performance proxy that could redirect to another proxy and collect specific types of non-encrypted data.
Didn't make it into my first paper, hopefully will end up in my thesis :)
At the time it was given to me it was a rough demo with no clear path forward to shipping. We had no metrics to tell how good it was, how good it had to be, or whether we were even making progress. We had no team of computer vision experts to work on core algorithms. We had no idea if the problem was solvable at any amount of power consumption. There were more than a few people within the company who thought it couldn't be done.
I want to be very clear about credit. I put this as the hardest thing I have ever done but I was only the manager in charge of the project. While I built the team and owned the problem, I did not write the code or design the algorithms. I had incredible people who did outstanding engineering work and researchers who advanced the boundaries of computer vision. It was a privilege to work with them and I am proud of them.
Back in the CRT monitor days, I was working for a computer repair company. There was this particular client (in the defence industry) whose monitors started flickering and developing a greenish hue at their sides after a week.
Every week, we had to go to his office to swap the monitors and bring the faulty ones back to recalibrate (it was costly, but hey, it's a Defence contract and those pay big bucks).
It didn't matter whether a monitor was brand new or a recalibrated one; it just started flickering and developed a greenish hue after a week, and it only happened in that room. Other monitors outside that room and on other levels were fine, so the room was dubbed the Poltergeist Room (as they blamed spirits for messing with it).
One day after the monitor exchange, I returned to the office and my supervisor asked why I hadn't replied to his multiple pages (we were using pagers back then). I realised I had been in the Poltergeist Room when the pages were sent and therefore never received them. It then dawned on me: "Could it be some electromagnetic interference from the level directly above or below playing havoc?"
I went back to the client the next day to tell him what I thought and he (being electronics trained) realised that above him was a defence lab carrying out EMF experiments, which could have caused the monitor problems. He got to work to build a simple Faraday cage to prevent EMF from getting to the monitor. Since then, the monitors worked perfectly.
I had encountered some seriously incorrect outputs from the application server. The output in question was a function of internal state and the current time (rounded to hours; it was a kind of "hourly" display). The application server was set to log many input/output pairs, so I was able to identify a non-trivial number of such errors, but I was unable to determine the cause. Common causes like memory corruption, time zones (the business logic heavily depended on local time), NTP synchronization, and even an interpreter bug were considered and rejected. Finally, after two weeks or so, I tried simulating the function with varying current times and fixed internal state, and surprisingly a portion (but not all) of the past output matched the observed output!
It turned out that glibc's `localtime` can misbehave by ignoring the local timezone when it is unable to read `/etc/localtime`, and the Linux box the server ran on had some issue reading that file (I never fully identified it; this read was probably the only disk I/O from that server anyway). In light of this finding I exhaustively inspected the past logs; it turned out the gross error rate was on the order of 10^-4 (!), and the way `localtime` was used meant the error could only alter a portion of the output. Studying the glibc code revealed that setting the `TZ` environment variable would disable the UTC fallback, so I did, and the error was gone.
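The `TZ` mechanism is easy to demonstrate from Python on a Unix box. This is a sketch of the mechanism only, not the original server's code; `time.tzset()` makes the C library re-read `TZ`:

```python
import os
import time

# Force an explicit timezone so localtime() never has to fall
# back to reading /etc/localtime (the fallback path that bit here).
os.environ["TZ"] = "UTC"
time.tzset()

# With TZ=UTC, local time and UTC agree for any epoch value.
epoch = 0
local = time.localtime(epoch)
utc = time.gmtime(epoch)
```

Pinning `TZ` explicitly in the service's environment sidesteps any dependence on the file read succeeding at runtime.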
Lesson: Learn your moving parts, even if you don't know them in advance.
Edit: Whoops, I realized this was ambiguous. I was using an iPad camera to track it and displaying the result as well as using the detection to trigger a camera shutter.
I was happy with this, until a friend challenged me to make it realtime. I managed to re-implement the same thing by using the built-in vector drawing (and as a bonus, this also gave anti-aliasing) and managed to get this down to 15ms.
The third version was using the 3D acceleration, and managed to get 100 lights to render in realtime. Was pretty proud of myself and I wrote an article about it, which was cited a few times by different people.
We started Friday after normal working hours by checking that the backup worked (it did), then proceeded to upgrade the server with a RAID backplane and three new SCSI disks, installed Windows NT, installed the backup software, and started a restore while getting some takeaway.
The restore only took about 15 minutes, and to our horror we discovered that the previous IT admin had set it up to do an incremental backup to the same DAT tape, overwriting it every day!
OK, no worries, we had not used the old disk, so we installed it and turned on the computer... Nothing happened... Strange. We removed the RAID backplane, installed everything as it had been... Still nothing.
After 24+ hours working on the problem, including several hours talking to Compaq support (best support ever!), we had to go home for some sleep. When I got back to the server room I fired up Norton Disk Editor and painfully figured out that the MBR on the disk was all zeros; luckily the rest of the disk looked like correct data!
Several hours later, just before Sunday turned to Monday, I finally got an MBR written using NDE and NDD, booted the system, and saw everything was all right.
On Monday we told the customer we'd had some problems and would do the upgrade another day (after we had taken multiple backups :)
But... I'm taking on the biggest challenge right now: I'm coding my Onyx database client idea for the third time. The hardest problem was to start o3db. I failed badly with Onyx 20 years ago, burning out while holding over half a million lines of C++ in flow together with nearly 10k lines of my own 4GL, during the third customer installation of Onyx. I was very shy of coding UI/UX afterwards and escaped deep into server stuff and machine learning, as far away from the user as possible.
So, my biggest challenge was to start Onyx again: a user-facing UI/UX for common business database applications with its own fourth-generation language. I've decided on Scheme as an intermediate language this time, and the prototype is running well. I now have a non-recursive Scheme interpreter and GUI running in the browser, able to process the meta tables that define an application. It's still a long road to my vision. But to restart a project I had failed at with burnout 20 years ago, and to code it with current technology, was the biggest personal challenge.
/join #o3db on freenode, if interested in a startup to create common business database clients for the web.
Tweak -> Compile -> build deployable package -> push to phone -> wait 6 minutes -> test on phone -> repeat....
"Aha!" you say. "You're smashing the stack! Function b() is writing outside its stack frame."
But function b() was provably not doing that.
Function b() called msgrcv(), which has a very badly designed API. It takes a pointer to a structure, and a size parameter. The structure is supposed to be a type field (long), and then a buffer (array of char). The size parameter is supposed to be the size of the buffer, not the size of the structure. The original code that implemented this came from a contractor, and they made the very natural mistake of passing the size of the whole structure as the size parameter. This meant that an extra long was read from the message queue, and smashed the stack.
But that should mess up the stack frame of function b(). How did it mess up a variable in function a()? Well, the compiler put that variable in a register, not on the stack. So when b() was called, it had to save off the registers it was going to use, and a()'s local variable wound up in b()'s stack frame.
It took me most of a month, off and on, to figure that out.
- I worked out, on pen and paper, sorting networks on my own a few years before the Wikipedia article on them existed. I was looking for shortcuts in a Quicksort implementation. I hadn't read Art of Computer Programming yet, which is probably the only other place I would've been likely to read about it. It hadn't been covered in any of the other programming literature that I was devouring at the time.
- I wrote a variable interpolator in COBOL. COBOL has no string operators or anything resembling a string data type. This one was tricky. I was working as a programmer/operator at a school district at the time and the central hub of their IT was a Unisys mainframe that ran COBOL and WFL. There weren't any punch cards anymore, but everything ran as if there were; for any given job to run, say, report cards, you had to go into the WFL job and edit a two-digit school code in half a dozen places, in "digital punch cards", which would then be fed one after the other into COBOL programs. This was error-prone and I wanted a way to define a couple of variables at the top of the job file and then have everything work after that.
- I worked for a BigCo that used Remedy for its internal support systems. There were some latent training issues in the internal support department and support requests kept getting modified by unknown people, which would cause the requests to get mishandled and would irritate various other departments. I found a way to sneak some code into the Remedy forms system and I cobbled together a very rudimentary communications protocol between several forms so that all changes to any form got logged to another form, along with the user's id. Remedy had no loop logic at the time. That actually made it to a Remedy developer's group mailing list once and I was a big fish in a very tiny little puddle for a day.
- I reverse-engineered portions of the .dbf format that FoxPro uses, and wrote software that could convert .dbf files into MySQL tables. The date format was tricky. It was an 8 byte field where the first four bytes were a little-endian integer of the Julian date (so Oct. 15, 1582 = 2299161), and the next four bytes were the little-endian milliseconds since midnight. This is not documented anywhere.
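For the curious, the date layout described above decodes in a few lines of Python (a sketch based purely on that description; the function and constant names are mine):

```python
import struct
from datetime import date, datetime, timedelta

# Offset between a Julian Day Number and Python's proleptic
# Gregorian ordinal (JDN 2440588 == 1970-01-01 == ordinal 719163).
JDN_TO_ORDINAL = 1721425

def decode_foxpro_datetime(raw):
    # Two little-endian 32-bit ints: Julian day, then ms since midnight.
    jdn, ms = struct.unpack("<ii", raw)
    d = date.fromordinal(jdn - JDN_TO_ORDINAL)
    return datetime(d.year, d.month, d.day) + timedelta(milliseconds=ms)

# Oct 15, 1582 (JDN 2299161) at noon (43,200,000 ms since midnight):
sample = struct.pack("<ii", 2299161, 43_200_000)
decoded = decode_foxpro_datetime(sample)
```

The anchor date matches the one given above: JDN 2299161 is the first day of the Gregorian calendar.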
Those are some of my favorites anyway. 30 years of programming, there's been some fun stuff along the way.
Documentation for the tools available seemed to varyingly assume that you either a) understood IPSec well enough and only needed to know how to use this one tool, or b) knew everything you needed to know, minus a few hints on the syntax of individual files.
Eventually I got everything working, but performance was abysmal. Sometimes. Sometimes SSH sessions opened instantly. Sometimes they opened slowly but then worked fine afterwards. Some tools were awful and others worked okay.
Eventually I realized that the IPSec configuration set up two tunnels to Amazon, but only set up actual routing (defining endpoints) for one of them. Thus Amazon was load-balancing packets over both tunnels and my Linux implementation was dropping 50% of packets. For established TCP connections this was fine because we had basically zero latency to VPC so retransmits (for what we were doing) were almost free since they would be discovered when the next packet arrived successfully, but for SYN/ACK packets a drop would result in an annoying wait.
Unfortunately, the tools don't allow you to define redundant/overlapping routes, so I couldn't set up two tunnels; I had to just configure one tunnel and leave the other one down so AWS wouldn't try to send data over it, and then just hope that that endpoint didn't go down at an inopportune time before I'd either set up some kind of load balancing scenario on my internal network (internal BGP maybe? ugh!) or given up entirely on the project.
After weeks of working on this specific task (the VPN setup) and making literally zero progress some days, googling for literal hours with no useful results, and trying various permutations, when I got it working I felt like I was the only person on the planet who'd ever done this before, since I was pretty sure that no one on the internet had ever written about it at least.
Even though the project was ultimately scrapped, I still feel like I learned a lot, and maybe I should feel like it was wasted time, but it also felt like quite an achievement to succeed.
Years ago I came up with a simple equation for determining priority of software engineering bug fixes and small features:
Priority = (Benefit the feature provides to the product) / (Time to complete the feature)
where benefit is defined by the business side using any scale (say 1-100), and time to complete is defined by the assigned software engineer using any unit (perhaps man-hours). Regardless of what range the numbers fall in, 0 to 1 or 0 to 42, you end up with an ordered list of tasks which equally value business value alongside engineering time.
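The whole scheme fits in a few lines; a sketch with made-up task data (benefit on a 1-100 scale, time in hours):

```python
def prioritize(tasks):
    # tasks: list of (name, benefit, hours).
    # Highest benefit-per-hour first, per Priority = Benefit / Time.
    return sorted(tasks, key=lambda t: t[1] / t[2], reverse=True)

tasks = [
    ("export button", 30, 2),   # ratio 15.0
    ("dark mode",     80, 40),  # ratio 2.0
    ("login bugfix",  95, 5),   # ratio 19.0
]
ordered = prioritize(tasks)
```

Note the units cancel out of the ordering entirely, which is why the two sides can score on whatever scales they like.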
I came up with this while working at a medium sized company. I was frequently tasked with too many things to do. Despite tasks being organized in a Redmine-like tool, the implementation was still done in random order because nobody could define priority. This led to much miscommunication about what I was working on in the recent past and future. I used the equation to better communicate my activity and future plans with the business side. Given an ordered list of tasks from this equation, anyone could see clearly what was being worked on next.
The business side resisted attaching a numeric benefit to the features, presumably because that's hard. But it's equally hard to define the time to complete a software engineering task, and I eventually convinced them we needed to at least try to be scientific about both.
n.b.: I used this while working on a mature system. For a newer project or for tasks with more dependencies, it's probably still complicated to define priority. In the setting I was in, it worked great.
My boss's boss however thought it was condescending and nobody aside from myself ever made use of it. I hope to make use of it again one day, but after one bad experience with a medium sized company, I've stuck with smaller places where this is not as necessary.
A few years ago I was contracting for a company that had a Native American Casino as a client. They wanted to build a gamified app/site to engage their customers more.
The single hardest problem was trying to look at the situation from the player's point of view. Gambling like this (slot machines) is inherently an illogical thing to do: they know they're never going to make back the money they put in, but they walk away with a smile night after night.
Trying to rationalise it (so we could understand their goals and what they might want out of an app/site targeted at them as players) proved impossible for basically everyone on the team.
It did go live eventually but I don't think it's taken off as they hoped.
I am currently working on my thesis in artificial intelligence which to me seems tough because I have never written a thesis before. However, at work, I am dealing with technical software engineering problems that will seem easy after I have solved them.
My first industry project involved creating a generic form builder which could ultimately be used as a survey tool to draw statistics from. This seemed extremely challenging at the time, but now that all of the design decisions have been made and the complexities solved, I could redo it pretty easily (even though we shouldn't reinvent the wheel).
Good thought provoking question though! Thanks!
I left this company long ago but they appear to be going strong still. http://www.giraffic.com/ . I'm sure they improved on that work a lot since then.
Since you asked, one cool part of that technology is that the order of received packets is not important for assembling the stream. Basically every 1 second of video is reassembled without regard to the order of packets received. You need N packets to assemble a "data frame"; IIRC pending incomplete data frames were stored in a simple hash table, but honestly it was so long ago I don't remember.
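That order-independent reassembly can be sketched with a plain dict keyed by frame id (all details here are mine; the original implementation surely differed):

```python
class FrameAssembler:
    """Collect packets into fixed-size data frames; arrival order is irrelevant."""

    def __init__(self, packets_per_frame):
        self.n = packets_per_frame
        self.pending = {}  # frame_id -> {seq: payload}, incomplete frames only

    def add(self, frame_id, seq, payload):
        chunks = self.pending.setdefault(frame_id, {})
        chunks[seq] = payload
        if len(chunks) == self.n:
            # All N packets present: emit the frame in sequence order.
            del self.pending[frame_id]
            return b"".join(chunks[i] for i in range(self.n))
        return None  # still waiting on packets

asm = FrameAssembler(3)
asm.add(7, 2, b"C")          # packets arrive out of order
asm.add(7, 0, b"A")
frame = asm.add(7, 1, b"B")  # third packet completes the frame
```

Because each payload is stored under its sequence number, the join at the end reconstructs the frame correctly no matter which packet arrived last.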
Of course, the real solution would be to press the dependency provider to release an x64 version, but we were not a priority of theirs.
The problems I failed to solve are, of course, the ones that seem the hardest. I tried to write a cluster manager / distributed OS by myself, starting almost from scratch, and that was too much. I spent upwards of 4 years on it and had some success, but I'm starting to move on.
In particular, I learned that having a reasonable amount of security with reasonable amount of development effort in a distributed system is still an unsolved problem. It's basically a bottomless pit of work.
TL;DR: Two-level system defects are a 20-year-old unidentified noise source that can be described by an oxygen atom spatially delocalising in an amorphous portion of the underlying circuit.
See http://dx.doi.org/10.1103/PhysRevLett.110.077002 and http://dx.doi.org/10.1088/1367-2630/17/2/023017.
I think React Native is super cool right now, if it's going to make nice UI easier and have good multi-platform support.
Basically it is one of those unfortunate cases where the first weeks make everything look really promising (single codebase and all), and it is only after several months of hard work that you realize there is no way you are going to win this battle.
The way I did this was by using NLTK to compare the hypernym paths of the words in the query against the hypernym paths of the category names. I wrapped it in a tiny Flask app and it was surprisingly fast enough for an MVP.
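The core comparison reduces to scoring overlap between hypernym paths. A toy sketch, with paths represented as plain tuples of strings in the same root-first shape that NLTK's `synset.hypernym_paths()` yields (the example words and helper names are mine):

```python
def path_overlap(path_a, path_b):
    # Hypernym paths run root-first, so a shared prefix means
    # shared ancestry; score by the length of that prefix.
    score = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        score += 1
    return score

def best_category(query_path, category_paths):
    # Pick the category whose hypernym path shares the most ancestry.
    return max(category_paths,
               key=lambda c: path_overlap(query_path, category_paths[c]))

categories = {
    "electronics": ("entity", "object", "artifact", "instrumentality", "device"),
    "clothing":    ("entity", "object", "artifact", "covering", "clothing"),
}
query = ("entity", "object", "artifact", "instrumentality", "device", "phone")
match = best_category(query, categories)
```

In the real thing you'd get the paths from WordNet (e.g. `wn.synsets(word)[0].hypernym_paths()`) and probably aggregate over multiple senses, but the ranking idea is the same.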
Back in 2002 I was writing a floppy disk driver for the little OS I was writing with a friend. It turned out that finding anything other than very sparse documentation was really hard; plus, for some unknown reason, the floppy drive's behavior seemed non-deterministic. Maybe the fact that I was 15 didn't help.
At some point, after many nights spent debugging it, it just worked. I still don't know why. I never changed a line of the code after that moment, for fear of breaking it.
- Anything where you're looking for a race condition. It tends to be hard to reproduce, and instrumentation can make it go away entirely, leaving you with a need to conjecture about what might be happening. Quite satisfying when you find it, but again, because it's rare, you don't know if you've really solved it.
- Built a cross-platform, cross-language messaging system for trading. Combined UDP and TCP, had detection of downed servers. A lot of fiddling with network stuff, performance optimization on all platforms, both VM and native.
This was the 90's. It was surprisingly hard to implement this in a workable, reliable, secure way, and no one in our company of 50 programmers had ever done such a thing before!
I recall being puzzled for way too long at how to prevent someone from coming to a browser that had just logged off from our web app, and clicking the back button a couple of times to be logged in again.
Now, of course, it is a common and easy-ish task.
There's a very big difference between concept and working code. :)
With cars: troubleshooting and fixing a Ferrari 599 without the required factory diagnostic computer. You can't beat a multimeter and some elbow grease. It was a faulty flow meter.
In general: figuring out what to do with my life. It took me a bit but was worth the time. Now I can focus on doing that and just that.
In the end I found out that I had managed to write a crack for the game by accident :D Later on I inspected a crack from another team; it patched the same regions!
And yes, I know a lot of what I just typed will probably put real game programmers' teeth on edge.
It turned out people were running the SLA battery completely flat repeatedly, and subtly "wearing out" the battery.
It was the early days of SOAP, and I had been assigned the task of integrating my employer's software with a third party's, so that the applications could share data. This third party org was a wealthy, powerful mega-corporation; and my employer was, well, not. The third party produced a spec for the interface, expected us to follow it, and offered no help from there.
I built a solution. It worked on my machine. Solved the problem. All was right in the world.
I moved it to the test environment. It worked again. Demoed it for one of our customers, and everyone was pleased.
Deployed it to our first beta tester. One lonely employee working accounts receivable, tucked away in the corner of our customer's office.
It crashed.
I checked everything. I mean everything. There are still particulars of that little Windows 2000 workstation that I can describe vividly. Which programs were installed, which patches were installed, how Windows had been configured, how the firewall worked, I even got permission to install a packet analyzer. My employer only had a handful of customers, and the beta test machine was near our offices, so I was over there personally a lot over the following weeks.
We brought in the customer's network support people. They found nothing. They could see the packets leaving, and an error coming back, but couldn't offer more than that.
We brought in the best networking engineer in my company. He was stumped.
What really shook my confidence was knowing that competitors of mine had gotten this interface working. This wasn't some half-baked project I could blame on someone else. Others had succeeded where I'd failed.
I practically had to walk across broken glass to get on the phone with the third party's development team, but with enough pestering I pulled it off.
The phone call involved me sitting at the beta test workstation and firing off a request so that they could view it hitting their servers live. The developer who I spoke with immediately spotted the problem.
You see, when you send a SOAP request, you send the date and time that you're making the request along with it. The clocks on the client and the server were too far out of sync, my requests appeared to be coming from the future, and so the server disregarded them with a blunt error. Interestingly, the workstation clocks at my company's office weren't too far out of sync, which is why it worked in one place and not another.
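Server-side, that kind of rejection is usually just a timestamp tolerance check. Roughly (a sketch, not the actual third party's logic; the tolerance value is made up):

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # assumed tolerance, for illustration

def accept_request(request_ts, now=None):
    # Reject requests whose claimed timestamp is too far from server
    # time in either direction -- including ones "from the future".
    now = now or datetime.now(timezone.utc)
    return abs(now - request_ts) <= MAX_SKEW

server_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
ok = accept_request(server_now + timedelta(minutes=2), now=server_now)
rejected = accept_request(server_now + timedelta(minutes=30), now=server_now)
```

A workstation clock drifted well past the window on one side of the check, and that was the whole bug.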
Stuff I learnt:
1. Third party interfaces require a point of contact at both organisations who can talk with one another. This is non-negotiable.
2. If you send an error message that reads "Error", you're a bad developer and should return your computer science degree to your university and demand a refund.
3. No matter how well written the spec is, something always gets left out.
4. Persistence matters more than anything.
It had three CCD cameras with strip imagers that were combined into a single all-sky image every orbit.
I was given a FORTRAN codebase that dated back to the 70s (supporting functions) and was told to figure out the best way to pick the start and end of the orbit as far as image frames were concerned.
The pointing data was in satellite frame-of-reference quaternions [1], and the satellite orbited about the axis of the Sun-Earth line, approximately.
Approximately was the key. Since it wasn't at a perfect 90 degree angle, the CCD strips each crossed over the plane defined by the Sun-Earth line and the axis orthogonal to the Earth's orbit (I referred to it as "south") at an angle.
So, if you want to stitch together an image of the sky that looks continuous, but the orbit of the imager wobbles a bit, and different discontinuities show up every day, how do you do it?
The leading CCD could be entirely across the southern line when the other CCDs were just starting to cross it. This created a lot of problems with how you define a complete orbit that lacks discontinuities and makes intuitive sense so others can understand the code.
I decided to pick the point where the middle of the central camera crossed the plane as the frame of reference for the start/end point.
Ultimately, this project took me about three months, just to get used to the code base, the spatial coordinates and transformations needed to make sense of the data, and then to finally write the code.
The meaningful changes I made in the commit consisted of about three lines of code.
I found the commit message:
Fixed problem near seam of map where start and end of orbit meet. The orientation of camera 2 at the start of the orbit is now used to draw a reference great circle on the sky. Near this boundary pixels are tested individually to decide whether they are part of the current orbit and should be dropped in the skymap. Introduced torigin to keep track of the time origin for the lowres time map. This is added to the Fits header of the time map as keyword TORIGIN (used to be STIME). Times tfirstfrm and tlastfrm are assigned the time of the first and last frame, respectively, for which at least one pixel was dropped in the skymap. These are written into the main header of the skymap as keywords STIME and ETIME. Added extra extension to lowres maps containg nr of pixels contributing to each lowres bin
[1] https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotati...
My first significant contribution to FOSS was to port OpenOffice.org to work without the then-proprietary Java, so that it could go into Debian main (and other distros with similar requirements). At the time, OO.o took 8 hours to build, or 3 hours with the wonders of ccache, and I was hacking on the build system itself, so incremental builds were often broken. (And the first thing OO.o built was its own implementation of make.) So over the course of a month or so, I would hack on it, rebuild to see it get a bit further, and repeat until it finally built without error. The net result was dozens of patches submitted and merged into Debian and ooo-build, and the 1.1.0-2 changelog entry listed here, which made it all worth it: http://metadata.ftp-master.debian.org/changelogs/main/libr/l... ('The "Wohoo-we-are-going-to-main" release')
The most challenging problems were two different mysterious crashes in BITS (biosbits.org), a Python environment running at the firmware level. Because of the environment, a crash means a sudden unexplained reboot, with no diagnostic information.
First, I was trying to debug a crash in the initial CPU bringup code, which brought the CPU from 16-bit real mode to 32-bit mode. After extensive investigation, including assembly output of characters to the serial port to indicate how far the code got, and hand-comparison of disassembled code with the original, it finally turned out to be a bug in the GNU assembler, mis-assembling an expression with a forward-referenced symbol when in .intel_syntax mode. The forward reference ended up becoming an unresolved relocation (with a 0 placeholder) instead of the intended compile-time constant, resulting in a wild pointer. It was one of the rare instances where the bug really was in the toolchain, combined with an environment that makes debugging a challenge.
The other such bug, in the 64-bit version of the same environment, involved GCC compiling struct assignments into SSE instructions that assume aligned addresses, and GRUB not actually aligning its stack for SSE because it never actually used SSE itself and didn't happen to use struct assignments. Debugging that one involved a quick hack of a general-protection-fault handler that hex-dumped the bytes of code around the instruction pointer, searching for those bytes in the compiled code, and matching that back up with the disassembly and source code.
Most recently, I debugged a race condition in a build system, where disk image manipulation (done by syslinux and mtools) was failing to obtain an flock file lock. The kernel doesn't actually have any way to find out who holds the lock, so I ended up instrumenting the flock syscall to print the conflicting lock holder. Turns out that udev took a file lock on the loopback device as soon as it showed up.
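The conflict itself is easy to reproduce once you know it's there. A second open() of the same file gets its own open file description, so a non-blocking flock attempt on it fails while the first lock is held — the same shape as syslinux/mtools losing the race against udev's lock on the loopback device. A minimal sketch (Linux; `try_lock` is my name):

```python
import fcntl

def try_lock(path):
    """Try to take an exclusive flock without blocking.
    Returns (file, True) on success, (file, False) if someone
    else already holds the lock."""
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f, True
    except BlockingIOError:
        # EWOULDBLOCK: another open file description holds the lock
        return f, False
```

Without LOCK_NB the second caller just blocks silently, which is exactly the hard-to-see failure mode in the build system.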
The bug wasn't easy to reproduce: all we saw was that, every once in a while, when queried over $wirelessprotocol, the system would begin answering with crap values (it was supposed to measure some physical quantities, and crap values = meaningless, as in negative active power and hundreds of kV on a mains line), and if you kept on pounding it, it would eventually start "acting funny" -- randomly toggling LEDs and handling commands that were never given in the first place -- before eventually crashing. The symptom was very far removed from the root cause; at first, all I was debugging was "system begins answering with trashed values after a while".
I was two days into it when a more experienced colleague stepped in to help me (I was a junior developer at the time). After removing module after module, the bug still wasn't reproducible by any particular sequence of steps, but the behaviour it triggered became fairly uniform, so we began suspecting one process was smashing another process' stack.
We decided a good way to test this assumption was to modify the context switching routine to dump the current top of the stack over a serial line; unfortunately, that introduced additional delays that prevented the bug from occurring, so it didn't help us. We figured, however, that the handler for $wirelessprotocol's query was in the process that smashed the other process' stack, so we modified that handler to send the top of the stack over wireless (this is where not having an MMU helped, ironically :-) ). The base of the other process' stack could be obtained by just tracing context switches.
Sure enough, if enough commands piled up, that process (which was running some pretty intensive stuff, including floating point operations, on a very resource-constrained system) would smash into the next one's stack, messing up its context's registers.
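A related defensive check for exactly this failure on MMU-less systems (not what we did at the time, but a standard technique) is stack painting: fill each task's stack with a sentinel byte at creation, then at each context switch measure the high-water mark and flag tasks that got close to the neighbouring stack. A toy model in Python, all names mine:

```python
SENTINEL = 0xAA

def paint_stack(size):
    """A task's stack, modeled as a bytearray filled with a sentinel.
    The stack is assumed to grow downward from the end of the buffer."""
    return bytearray([SENTINEL] * size)

def high_water_mark(stack):
    """Max bytes of stack ever used: distance from the end of the
    buffer to the lowest overwritten (non-sentinel) byte."""
    for i, b in enumerate(stack):
        if b != SENTINEL:
            return len(stack) - i
    return 0

def check_overflow(stack, margin=16):
    """True if the task came within `margin` bytes of running off the
    bottom of its stack into the next task's memory."""
    return high_water_mark(stack) > len(stack) - margin
```

On real hardware the check is just a loop over the bottom few words of each stack at context-switch time, which is cheap enough not to perturb timing the way our serial dump did.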
In retrospect, this wasn't necessarily a difficult bug per se: the concept is well-understood and the theory behind it is trivial. The biggest problem is that it challenges the fundamental way we debug programs: when the CPU starts doing crap, we assume we've instructed it to do crap, and it's (correctly!) following consistently bad instructions. In this case, the CPU ended up following random instructions.
Late one night, when no one else was there, I ran "top" only to puzzle over a bunch of identical command lines consuming all the CPU:
login -p Mkkuow....
I don't remember the exact username, but this Mkkuow guy was trying to log into all the terminals on each box. I don't clearly recall how I figured this out, but it was the result of capacitive coupling (parasitic capacitance) between the transmit and receive RS-232 wires. The OS would transmit "SunOS login:", then get garbage on the receive line. Then it would prompt for the password a few times, eventually giving up and transmitting the login prompt again.
The actual username I saw is easy to figure out by graphing the ASCII voltage levels and then considering how capacitance works.
The solution was to replace all the cables with lower-capacitance cable. Because that required all new connectors, as well as my time to install them, my manager Karen Coates took some convincing, but in the end the new cable stopped the hangs.
Think about that the next time your code gets you down.