Ask HN: What useful internal tools or libraries have you built in your company?

59 points · crack-the-code · 6y ago · 29 comments

I'm curious to know what kind of tools, scripts, automation, libraries, etc. you all have built to help boost the productivity of your team(s).

29 comments

z3ugma6y ago

At a company of 10,000, it's important to know the 100 people you'll be working closest to. I built a "memory" game as a webapp which matched the faces of 4 people on your team to a single name and a list of self-assigned skills. You click on a photo to match a name to a face, and once you guess right a new set loads. You can randomly click through your whole team and learn a lot about them in just 15 minutes or so.

The whole thing was built with read-only SQL scripts, Flask, and some jQuery.
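For anyone wanting to build something similar, here is a minimal sketch of the round logic described above: pick four teammates, show one name and skill list, and check the clicked photo. The `Teammate` fields and the function names are assumptions for illustration, not the original code; a Flask route would simply return `new_round(...)` as JSON.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Teammate:
    # Hypothetical record shape; the real data came from read-only SQL queries.
    name: str
    photo_url: str
    skills: tuple[str, ...]


def new_round(team, rng=random, faces=4):
    """Pick `faces` distinct teammates and choose one of them as the answer."""
    choices = rng.sample(team, faces)
    answer = rng.choice(choices)
    return {
        "name": answer.name,
        "skills": list(answer.skills),
        "photos": [t.photo_url for t in choices],
        "answer_photo": answer.photo_url,
    }


def is_correct(round_, clicked_photo):
    """True when the clicked photo belongs to the named teammate."""
    return clicked_photo == round_["answer_photo"]
```

On a correct guess the client would request a fresh round, which matches the "new set loads" behaviour described in the comment.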

suramya_tomar6y ago

This sounds very interesting and useful. I was thinking about making something similar but with my personal contacts & people I meet socially.

Is your code opensource or only for internal use?

sloaken6y ago

I agree, I think that's a great idea. I would like to put it up at my place. Mostly for me, as I can never remember people, and it's not that big of a company, but there are so many people I do not know.

stephenbez6y ago

Nice. I've done a similar thing using spaced repetition software like Anki to practice everyone's name right after I join a company.

reacharavindh6y ago

At my current job, I saw a lab technician working manually with Excel sheets: he entered sample IDs, copy/pasted each ID into a website to get a bar code, and then printed it out to stick on the box.

I wrote a Python script that uses the openpyxl module to read his Excel docs and the reportlab module to generate bar codes in a PDF document with appropriate spacers, so that he can simply print it out and stick the labels on the boxes.

He is happy and so am I that I could save his time. It only took me 20 mins to write this script.
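A rough sketch of what such a script could look like, assuming the sample IDs sit in the first column of the active sheet below one header row, and that a 3 x 8 label grid on an A4-sized page is acceptable. The grid size, margins, and function names are guesses for illustration; the original script is not shown in the thread. openpyxl and reportlab are imported inside the functions that need them, so the layout helper works without either installed.

```python
def label_positions(count, cols=3, rows=8, page_w=595.0, page_h=842.0,
                    margin=36.0):
    """Yield (page, x, y) for each label, filling a cols x rows grid per page.

    Coordinates are PDF points with the origin at the bottom-left corner.
    """
    cell_w = (page_w - 2 * margin) / cols
    cell_h = (page_h - 2 * margin) / rows
    per_page = cols * rows
    for i in range(count):
        page, slot = divmod(i, per_page)
        row, col = divmod(slot, cols)
        yield page, margin + col * cell_w, page_h - margin - (row + 1) * cell_h


def read_sample_ids(xlsx_path, column=1, header_rows=1):
    """Read non-empty cells from one column of the active worksheet."""
    from openpyxl import load_workbook  # third-party

    sheet = load_workbook(xlsx_path, read_only=True).active
    rows = sheet.iter_rows(min_row=header_rows + 1, min_col=column,
                           max_col=column, values_only=True)
    return [str(value) for (value,) in rows if value is not None]


def write_barcode_pdf(sample_ids, pdf_path):
    """Draw one Code 128 bar code plus its text per grid cell."""
    from reportlab.graphics.barcode import code128  # third-party
    from reportlab.pdfgen import canvas

    pdf = canvas.Canvas(pdf_path, pagesize=(595.0, 842.0))
    current_page = 0
    positions = label_positions(len(sample_ids))
    for sample_id, (page, x, y) in zip(sample_ids, positions):
        if page != current_page:  # grid is full: start a new page
            pdf.showPage()
            current_page = page
        code128.Code128(sample_id, barHeight=40, barWidth=1.0).drawOn(pdf, x, y + 20)
        pdf.drawString(x, y + 6, sample_id)
    pdf.save()
```

Usage would be along the lines of `write_barcode_pdf(read_sample_ids("samples.xlsx"), "labels.pdf")`, where both file names are placeholders.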

thedevindevops6y ago

There genuinely needs to be more of this sort of thing.

The overall 'productionisation' of our industry has led us into a cookie cutter style of work and away from genuine problem solving like that. Ironically that sort of productivity boosting work has been wrapped up in a nonsense 'process automation consultant' role that is inflated beyond sense and often dismissed by the receiving company as an unnecessary expense.

digitalsushi6y ago

I wrote a shim layer for all our packer/vagrant OS workflows to operate against an unreliable vsphere ecosystem. It exposes a suite of posix sh functions for sysadmins/developers to easily operate against this very unreliable environment. It adds automatic logging, retrying, and adjustable verbosity because of the numerous ways this environment randomly fails.

People can just . source the file in from a shared location and often find that their scripts just start to work better. It's not perfect, nothing's perfect. It's not even that clever. But when builds and deploys start to work twice as well, even with the remaining failures, well, that's something. None of the 65000 employees using it will ever know, but it feels good to know we were dropping 2/3 of orders and now we're dropping 1/3.
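The original shim is a set of POSIX sh functions and is not shown here. As an illustration of the same idea (logging, retrying, adjustable verbosity) in Python, here is a minimal sketch; the function name, defaults, and backoff policy are assumptions, not the author's.

```python
import logging
import subprocess
import time

log = logging.getLogger("shim")  # verbosity is adjusted via the logging level


def run_with_retry(cmd, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Run `cmd` (an argv list), retrying on a non-zero exit status.

    Returns the CompletedProcess of the first successful attempt and raises
    RuntimeError once every attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            log.info("ok after %d attempt(s): %s", attempt, cmd)
            return result
        log.warning("attempt %d/%d failed (exit %d): %s",
                    attempt, attempts, result.returncode, result.stderr.strip())
        if attempt < attempts:
            sleep(delay)      # wait before the next try
            delay *= backoff  # and wait longer each time
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts")
```

Callers get retries and a log trail without changing their own logic, which is roughly why sourcing a shared file of wrappers can make existing scripts "just start to work better".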

MediumD6y ago

Back at my old job, people would have trouble knowing what to do when on-call.

I built a Slack app that kept track of my team's pages and what people did to respond to them. As new pages were triggered, the bot would show the on-call person what previous responders had done to resolve the same page.
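The core of a bot like this can be quite small. A minimal sketch of the bookkeeping, with the Slack and paging integrations left out; the class and method names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PageHistory:
    """Remembers how each alert was resolved in the past."""
    _notes: dict = field(default_factory=dict)

    def record(self, alert_name, responder, action):
        """Store what a responder did for a given alert."""
        self._notes.setdefault(alert_name, []).append((responder, action))

    def suggestions(self, alert_name, limit=3):
        """Return up to `limit` past resolutions, most recent first."""
        return list(reversed(self._notes.get(alert_name, [])[-limit:]))
```

When a new page fires, the bot would post `suggestions(alert_name)` to the on-call person.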

sethammons6y ago

Nifty. This should integrate with a service runbook.

i5h4n6y ago

In my previous organization, we dealt with a legacy enterprise software product which had accumulated a massive bug history over multiple years and sub-products, all tracked by an in-house bug tracking product.

Lots of the issues we saw reported were either already fixed or had been config issues. In order to (somewhat) quickly find existing fixes/comments for issues reported to us, I built a search tool (webapp) which scraped the bugs and their comments to find any information relevant to your query and listed the results in order of matching probability.

Was a pretty cool learning experience to build that out. I had deployed it on a personal remote VM that devs were granted, have no idea if people are still using it.
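The comment does not say how the "matching probability" was computed. One simple stand-in is the fraction of query words that appear in a bug's text; the sketch below uses that, and the scoring rule and function names are assumptions rather than the original method.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Fraction of distinct query words found in the document (0.0 to 1.0)."""
    query_words = set(tokenize(query))
    if not query_words:
        return 0.0
    return len(query_words & set(tokenize(document))) / len(query_words)


def search(query, bugs, limit=10):
    """`bugs` maps bug id -> scraped text. Returns (bug_id, score), best first."""
    ranked = sorted(((score(query, text), bug_id) for bug_id, text in bugs.items()),
                    reverse=True)
    return [(bug_id, s) for s, bug_id in ranked if s > 0][:limit]
```

A real tool would likely weight rare words more heavily (TF-IDF or similar), but even this crude overlap surfaces "already fixed" bugs for a pasted error message.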

tehlike6y ago

I built a hacked-up experimentation framework for client-side flags that boosted my team's speed and confidence quite a bit. Hacked up because it didn't use the existing server-side mechanism, for a bunch of reasons.

I used the same experimentation framework for automated JavaScript binary releases, so at some point I could release 5 times a week with no issues. I have since left the team; people took that on and it keeps running like clockwork.

I showed them how to use PowerDrill (a data drilling and analysis tool) and taught them metrics. It is surprising how little people care about what their work is really about in the end, and bringing them a data-driven mindset gave an even bigger productivity boost.

Adamantcheese6y ago

At my last job we had to do builds constantly and put them on hardware, which was annoying because builds took 15 minutes and putting them on the hardware took another 10. I couldn't solve the latter because it wasn't in our domain, but for the first half I managed to "multithread" the build using a really hacky batch script compilation method: a makefile called the compiler in a new command window for each file that needed to be compiled, with some checks for "needs to be compiled" or "wasn't changed". An extra script at the end of the process made sure that all the compiler instances had finished before continuing with the next step. All of that work got it down to 2 minutes, or in small-change cases, about 30 seconds.

Another part of that was integrating some configuration data with existing files, which was as simple as writing up a bunch of Excel macros to do the copy/pasting and file output. It was hooked up to a shared folder on the network so the other team could just do their part, and then my part was entirely automated. In fact, the team testing things could do everything by themselves without any input from me at that point and only needed me to answer certain questions.

Yes, it's really hacky and the whole thing is entirely silly and could have been solved by using more proper tools (i.e. not a defunct make software without wildcard support for input files or Excel for configuration), but I was VERY pleased when I got it working.
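The same three steps (skip unchanged files, compile the rest concurrently, wait for all of them before moving on) can be sketched in Python. This is an illustration of the technique, not the original batch/make setup; the `cc -c` command line, `.c`/`.o` suffixes, and function names are assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def needs_compile(source, obj):
    """True when the object file is missing or older than its source."""
    return not obj.exists() or obj.stat().st_mtime < source.stat().st_mtime


def stale_sources(src_dir, obj_dir, suffix=".c"):
    """Return (source, object) pairs that have to be rebuilt."""
    src_dir, obj_dir = Path(src_dir), Path(obj_dir)
    pairs = ((src, obj_dir / (src.stem + ".o"))
             for src in sorted(src_dir.glob(f"*{suffix}")))
    return [(src, obj) for src, obj in pairs if needs_compile(src, obj)]


def compile_all(jobs, compiler=("cc", "-c"), workers=8):
    """Compile every (source, object) pair concurrently, then wait for all."""
    def compile_one(job):
        src, obj = job
        return subprocess.run([*compiler, str(src), "-o", str(obj)]).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(compile_one, jobs))  # blocks until all finish
    if any(codes):
        raise RuntimeError("at least one file failed to compile")
```

Threads are enough here because each worker only waits on an external compiler process; the parallelism comes from the compilers running side by side.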

actionowl6y ago

I was working on a project where we'd be printing several hundred thousand badges for several schools. We had all the data and just needed photos. The client sent us a DVD with several hundred thousand photos. Upon inspection we realized that the photos were really bad:

- No single aspect ratio

- Some photos had no one in them (picture of a chair, etc)

- Some photos had multiple people in the photo (!?)

- Some photos were of such poor quality that you couldn't make out the person.

It seemed some locations let the students provide their own photo. This is the first time we'd ever encountered data in this shape.

My company had two options: Print the data as-is (which would result in thousands of reprints) or hire some temp staff to sort through the photos.

I asked them to let me try to sort them over the weekend with a library I had just learned about (OpenCV). I was able to write a custom OpenCV Python script a little over a hundred lines long and ran it over the weekend to crop and sort the photos into several categories (based on face detection), leaving only a few thousand that had to be manually reviewed! That had a real dollar impact and felt really good.
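A minimal sketch of the sorting step, assuming OpenCV's bundled Haar cascade for frontal faces and a simple "zero, one, or many faces" rule. The folder names, thresholds, and the choice of detector are guesses, the cropping step is left out, and the original script is not shown in the thread. `cv2` is imported inside the functions that need it so the categorisation rule can be read and tested on its own.

```python
import shutil
from pathlib import Path


def categorize(face_count):
    """Map a face count (or None for an unreadable file) to a folder name."""
    if face_count is None:
        return "unreadable"
    if face_count == 0:
        return "no_face"
    if face_count == 1:
        return "ok"
    return "multiple_faces"


def count_faces(image_path, detector):
    """Return the number of detected faces, or None if the file can't be read."""
    import cv2  # third-party: opencv-python

    image = cv2.imread(str(image_path))
    if image is None:
        return None
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return len(detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5))


def sort_photos(photo_dir, out_dir):
    """Copy each .jpg into a sub-folder of out_dir named after its category."""
    import cv2

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    for photo in sorted(Path(photo_dir).glob("*.jpg")):
        target = Path(out_dir) / categorize(count_faces(photo, detector))
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(photo, target / photo.name)
```

Photos too poor for the detector to find a face land in `no_face`, so that folder doubles as the manual-review pile.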

stevekemp6y ago

In the past week I've written a broken-link checker, in Perl, to sanity-check the output of a static-site-generator.

I've also written a trivial PHP parser which was designed to match up class-definitions with comments above them:

https://blog.steve.fi/parsing_php_for_fun_and_profit.html

Both of these tools were designed to be invoked by CI/CD systems, to flag potential problems before they became live.
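The link checker mentioned above was written in Perl and is not shown. As an illustration of the idea, here is a minimal Python sketch that checks only local links in generated HTML; which tags are inspected, the handling of directories, and the function names are assumptions.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects href/src values from <a> and <img> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in {("a", "href"), ("img", "src")}:
                self.links.append(value)


def broken_links(site_root):
    """Return (page, link) pairs whose local target file is missing."""
    root = Path(site_root)
    broken = []
    for page in sorted(root.rglob("*.html")):
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8"))
        for link in collector.links:
            parsed = urlparse(link)
            if parsed.scheme or parsed.netloc or not parsed.path:
                continue  # external URL or same-page anchor: not checked here
            base = root if parsed.path.startswith("/") else page.parent
            target = base / parsed.path.lstrip("/")
            if target.is_dir():
                target = target / "index.html"
            if not target.exists():
                broken.append((str(page.relative_to(root)), link))
    return broken
```

In a CI job, exiting with a non-zero status whenever `broken_links("public")` is non-empty (the directory name is a placeholder) is enough to fail the build before the site goes live.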

Most of my work involves scripting, or tooling, around existing systems and solutions. For example, another developer-automation hack was to automatically add the `approved` label to pull-requests which had received successful reviews from all selected reviewers, on a self-hosted GitHub Enterprise installation.
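The decision behind such a label can be sketched as a pure function. The input shapes here are assumptions: a list of selected reviewer logins and a chronological list of `(reviewer, state)` pairs, with state names such as `APPROVED` and `COMMENTED` modelled on GitHub's review states. Fetching the data and applying the label through the API are left out.

```python
def should_label_approved(selected_reviewers, reviews):
    """True when every selected reviewer's latest verdict is an approval.

    `reviews` is an iterable of (reviewer, state) in chronological order.
    Plain comments are ignored so they don't cancel an earlier approval.
    """
    latest = {}
    for reviewer, state in reviews:
        if state != "COMMENTED":
            latest[reviewer] = state
    selected = set(selected_reviewers)
    return bool(selected) and all(latest.get(r) == "APPROVED" for r in selected)
```

A pull request with no selected reviewers is deliberately not labelled.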

rahulrrixe6y ago

I built a code generator package in Kotlin which generates code for Kotlin, Swift, Web (JS), and React-Native (TypeScript). Basically, you provide your class definition in a DSL style (similar to TOML) and it will generate the implementation and interfaces of the bridge for the different technologies.

s66qnf926y ago

This is great! We're working on code generation from class definitions right now.

Any good resources worth looking at?

rahulrrixe6y ago

I started by checking how you can write HTML using Kotlin DSL. Here is the source code https://github.com/Kotlin/kotlinx.html

Once the DSL is finalized, I have to generate the different languages. To achieve this, I borrow the architecture of the Flask framework: there, routes have HTML templates; here, each generator has its own templates.
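To make the "each generator has its own templates" idea concrete, here is a small sketch in Python rather than Kotlin. The definition format, type names, and templates are all invented for illustration; the real package's DSL and output are not shown in the thread.

```python
from string import Template

# One type table and one pair of templates (outer block, field line) per target.
TYPE_MAP = {
    "kotlin": {"string": "String", "int": "Int", "bool": "Boolean"},
    "typescript": {"string": "string", "int": "number", "bool": "boolean"},
}

TEMPLATES = {
    "kotlin": (Template("interface $name {\n$body\n}"),
               Template("    val $field: $type")),
    "typescript": (Template("interface $name {\n$body\n}"),
                   Template("  $field: $type;")),
}


def generate(definition, target):
    """Render one class definition as an interface in the target language."""
    outer, line = TEMPLATES[target]
    types = TYPE_MAP[target]
    body = "\n".join(line.substitute(field=name, type=types[kind])
                     for name, kind in definition["fields"].items())
    return outer.substitute(name=definition["name"], body=body)
```

Adding a new target language then means adding one entry to each table, without touching the parsing side.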

ioddly6y ago

Not familiar with GP's use case, but TypeScript has a glorious compiler API that could handle the heavy lifting here.

solumos6y ago

When our company was doing more active Go development, a colleague and I built Charlatan.

https://github.com/percolate/charlatan

Ended up saving us a lot of time writing mocks for tests.

cyanide9116y ago

Python 3+:

Blue - A dead simple event based workflow execution framework.

I always find it easier to model systems from an event driven perspective. Especially when you have to move fast and evolve unpredictably. I wanted a framework anyone could learn to use within 5-10 minutes. At the same time it should be able to solve all kinds of use cases that require event based coordination between tasks in a distributed environment.

Works well for us for simple use cases (eg. data processing workflows) and complex ones (eg. our entire retail order fulfilment system).
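Blue's actual API is not shown in the thread. For readers unfamiliar with the style, this is a minimal, single-process sketch of event-based task coordination; the `Workflow` class and its methods are hypothetical, and a distributed version would replace the in-memory queue with a message broker.

```python
from collections import defaultdict, deque


class Workflow:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name):
        """Decorator: register a task to run whenever `event_name` fires."""
        def register(func):
            self._handlers[event_name].append(func)
            return func
        return register

    def run(self, event_name, payload):
        """Process events in FIFO order until no task emits a new one.

        Returns the names of the events handled, in order.
        """
        queue = deque([(event_name, payload)])
        history = []
        while queue:
            name, data = queue.popleft()
            history.append(name)
            for handler in self._handlers[name]:
                # A task may return follow-up (event_name, payload) pairs.
                queue.extend(handler(data) or [])
        return history
```

Tasks never call each other directly; they only emit events, which is what lets a system built this way evolve without rewiring existing steps.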

raihansaputra6y ago

Is this similar to Prefect? I'm interested in using that kind of system. I think it would be really easy to mock up business processes quickly with these kinds of tools before building a more robust solution.

dhruvkar6y ago

I wrote a shipping container tracking system for ~7 shipping lines.

Each shipping line offers a tracking service through one of these methods -- email, RSS or website form. Our container numbers are collected into a Google Spreadsheet via our freight forwarders. Our employees use an antiquated ERP with no API.

The script collects the relevant container numbers from the Google spreadsheet, scrapes the update from the shipping line, and then scrapes the ERP system to enter the update.

Random_Person6y ago

I wrote a custom documentation tool that we use on all of our projects. It's a few input fields for heading/paragraph/images and a few buttons. You can add as many "sections" as you want. It exports HTML/CSS that you can stick in any <div> and it scales well, handles popups for images, and such. It's made our life much simpler when adding documentation to our sites.

shanecleveland6y ago

Automated discovery of late shipments eligible for a refund, which the carriers otherwise make very difficult to track. There are some services that can do this, but they take a big chunk of the refunds. We save thousands each year.

Many other specialized calculators and templates, which tend to be more foolproof than Excel.

schappim6y ago

I did the same in Australia. When I ran the script to request refunds for missed Express Post SLAs, we were 90% of Auspost’s inquiries for the day and got back thousands.

shanecleveland6y ago

The interesting thing with both FedEx and UPS is that they also guarantee delivery times for regular "ground" shipments, which are fairly prone to delays. The exceptions are circumstances out of their control, such as weather or recipient issues. And they suspend during a couple holiday windows. But most shippers either don't realize there are guarantees or don't have the means to efficiently track them.
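The rules described in this subthread (late versus the guaranteed time, no carrier-excused exception, outside any suspension window) translate into a short filter. A sketch under assumed field names; getting the tracking data out of the carriers is the hard part and is not covered here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Shipment:
    # Hypothetical record; real carrier data will look different.
    tracking_number: str
    guaranteed_by: datetime
    delivered_at: Optional[datetime]
    exception: Optional[str] = None  # e.g. "weather", "recipient unavailable"


def refund_candidates(shipments, suspended=()):
    """Return tracking numbers of late deliveries that may be refundable.

    `suspended` is a sequence of (start, end) windows during which the
    carrier's guarantee does not apply.
    """
    candidates = []
    for s in shipments:
        if s.delivered_at is None or s.exception:
            continue  # still in transit, or delay excused by the carrier
        if any(start <= s.guaranteed_by <= end for start, end in suspended):
            continue  # guarantee suspended (e.g. a holiday window)
        if s.delivered_at > s.guaranteed_by:
            candidates.append(s.tracking_number)
    return candidates
```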

schappim6y ago

Mine was hardware and software related.

I built a WebUSB Postal Scale and WebUSB Label Printer so our e-commerce company could print carrier shipping labels with just one click.

It took the process of fulfilling an order down to ~10 seconds per order.

theSage6y ago

Wrote a simple fizzbuzz server which brought down the time we spent interviewing freshers for internships/jobs. Since we're a small team, this had a big impact.

atomashpolskiy6y ago

My last job was at a company that develops one of the most popular mobile MMO action games in the world (with hundreds of millions of installs). It stores data in large Cassandra clusters (depending on the platform, DCs contain up to a hundred nodes).

What I did was design and develop a command line utility/daemon for performing one-off and regular backups of production data. The solution is able to:

- work with a 24/7 live Cassandra cluster, containing tens of nodes

- exert tolerable and tuneable performance/latency footprint on the nodes

- backup and restore from hundreds of GBs to multiple TBs of data as fast as possible, given the constraints of the legacy data model and concurrent load from online players; observed throughput is 5-25 MB/s, depending on the environment

- provide highly flexible declarative configuration of the subset of data to backup and restore (full table exports; raw CQL queries; programmatic extractors) with first-class support for foreign-key dependencies between extractors, compiled into a highly parallelizable execution graph

There was an "a-ha!" moment when I realized that this utility can be used not only for backups of production data, but for a whole range of day-to-day maintenance tasks, e.g.:
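To illustrate the last point, here is one way dependencies between extractors can be turned into parallelizable batches, using a topological sort from the Python standard library (3.9+). The extractor names are invented and this is not the utility's actual implementation, only a sketch of the technique.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def execution_batches(dependencies):
    """Group extractors into batches that can each run in parallel.

    `dependencies` maps an extractor name to the set of extractors it
    depends on. Every extractor in a batch has all of its dependencies
    in earlier batches.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

For example, if `inventories` depends on `users`, the users extractor finishes first and its keys can then drive the inventory queries, while unrelated tables are exported alongside it.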

1) Restore a subset of production data onto development and test machines. This solves the issue of developers and QA engineers having to fiddle with the database, when they need to test something, whether it be a new feature or a bugfix for production. They can just restore a small subset of real, meaningful and consistent data onto their environment with just a bit of configuration and a simple command. Developers may do this manually when needed, and QA environment can be restored to a clean state automatically by CI server at night.

2) Perform arbitrary updates of graphs of database entities. It's a common approach to traverse Cassandra tables, possibly with a column filter, in order to process/update some of the attributes (e.g. iterate through all users and send a push notification to each of them). The more users there are, the longer it takes and the more it hurts the cluster's performance and latency for other concurrent operations. With a tool like the one I described, one may clone user data onto a separate machine beforehand (e.g. at night), and then just run the maintenance operation sometime during the day, on data that is still reasonably up-to-date.

All in all, it was a fun experience of devops, which I'm quite fond of. With just a little creativity and out-of-the-box thinking, there are lots of ways to improve the typical workflow of working with data.
