This reminds me of an article by Ted Unangst[1], in which he flattens the various libraries and abstractions to show how xterm (to cite one of many culprits) in one place is effectively doing:
if (poll() || poll())
    while (poll()) {
        /* ... */
    }
In other words, if you don't know what your library/abstraction is doing, you can end up accidentally duplicating its work.

Reminds me of some aphorism: "Those who do not learn from history..." ;)
[1] http://www.tedunangst.com/flak/post/accidentally-nonblocking
Well yes, this is the very definition and goal of abstraction.
I ran an experiment where I timed the runtime of the sample program provided in the OP, except I changed the number of calls to localtime() from ten times to a million. I then timed the difference with and without export TZ=:/etc/localhost. The net savings was .6 seconds. So for a single call to localtime(3), the net savings is 0.6 microseconds.
That's non-zero, but it's likely in the noise compared to everything else that your program might be doing.
"fast" is a relative term, and is somewhat orthogonal to "efficient".
There's a reason why certain functions use a vDSO. If you're just going to use a syscall anyway, there's kind of no point.
> formatting dates and times
This shouldn't require a call to localtime; the article needs more explanation here. Breaking seconds-since-epoch out into year/mo/day/etc. is "simple" math and shouldn't require a filesystem access. Something else is amiss here.
> for everything from log messages
You're about to hit disk; a cached stat() isn't going to matter.
> to SQL queries.
You're about to hit the network; a cached stat() isn't going to matter.
(Now, I'm not saying you shouldn't set TZ; if it saves some syscalls, fine, and it might be the only sane value anyways.)
¹one of my old teams had an informal rule that any invocation of datetime.datetime.now() was a bug.
One example: the folks over at Slack record every syscall for security auditing. https://slack.engineering/syscall-auditing-at-scale-e6a3ca8a...
This can actually be a problem, since there are applications like git which assume stat is fast, and so it aggressively stat's all of the working files in the repository to check the mod times to see if anything has changed. That's fine on Linux, but it's a disaster on Windows, where the stat system call is dog-slow. Still, I'd call that a Windows bug, not a git bug.
On x86_64, syscalls only use SYSCALL. It's very fast if audit and such are off and reasonably fast otherwise. (I extensively rewrote this code recently. Older teardowns of the syscall path are dated.)
If that's true, then one test where you have a single process spinning into and out of a single syscall will have very different performance characteristics than a test where you have more processes than processor cores, because context switches flush the TLB.
Somebody who knows actual things about x86 and so forth please tell me if I'm spouting 90s-era comp sci architecture textbook stuff that no longer applies.
Obviously if you care about performance then you wouldn't be running your program on a Raspberry Pi in the first place. But for everything else there's this free speed up.
OTOH, I've never encountered an issue like this on those systems... (yet)
But it's neat information to have in the back of your head.
As just one example, what if you're stat()ing over NFS with a busy, flaky, and/or distant server? A bit of thought and you'll come up with a bunch of other cases where it suddenly starts to matter.
$ time ./tz
./tz 2,24s user 6,28s system 98% cpu 8,612 total
$ export TZ=:/etc/localtime
$ time ./tz
./tz 1,35s user 0,00s system 98% cpu 1,364 total
So 0.7 microseconds on my machine.

Hope this is just a typo in your comment, not the actual test ;)
See `man timezone` on a Linux system[1]. Specifically, see the passage that I've quoted below. Note that this is the third of three different formats that the man page describes that you can use in TZ:
> The second format specifies that the timezone information should be read from a file:
:[filespec]
> If the file specification filespec is omitted, or its value cannot be interpreted, then Coordinated Universal Time (UTC) is used. If filespec is given, it specifies another tzfile(5)-format file to read the timezone information from. If filespec does not begin with a '/', the file specification is relative to the system timezone directory. If the colon is omitted each of the above TZ formats will be tried.

http://mail-archives.apache.org/mod_mbox/httpd-dev/201111.mb...
https://github.com/apache/httpd/blob/trunk/server/util_time....
The internals of glibc can often be pretty surprising. I'd really encourage people to go spelunking in the glibc source when they're profiling applications.
If you enjoyed this post, you may also enjoy our deep dive explaining exactly how system calls work on Linux[1].
[1]: https://blog.packagecloud.io/eng/2016/04/05/the-definitive-g...
TZ=:/etc/localtime
I've set TZ sometimes without the colon and it seems to work. I did a quick online search and didn't find anything relevant.
However, the reason it works without the colon is that the implementation is lazy: it just ignores the : delimiter and falls back to parsing out a filename either way:
https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzset....
https://www.gnu.org/software/libc/manual/html_node/TZ-Variab...
The third format looks like this:
:characters
Each operating system interprets this format differently; in the GNU C
Library, characters is the name of a file which describes the time zone.
The other formats specify the timezone directly, such as EST+5EDT. Interestingly, it seems to work okay without the colon. Perhaps the leading slash implies a filename?

My favorite part:
> WTF?? Why is ls(1) running stat() on /etc/localtime for every line of output?
[1] http://www.brendangregg.com/blog/2014-05-11/strace-wow-much-...
- Why does glibc check /etc/localtime every time localtime is called? Wild guess: so that new values of /etc/localtime are picked at runtime without restarting programs.
- Corollary: why does glibc not check /etc/localtime every time localtime is called, when TZ is set to :/etc/localtime? Arguably the reason above should still apply when TZ is set to a file name, shouldn't it?
https://sourceware.org/git/?p=glibc.git;a=commit;h=68dbb3a69...
I'd argue it should cache the same when both old_tz and tz are NULL (but start with an old_tz that is not NULL).
I was about to file an upstream bug, but found https://sourceware.org/bugzilla/show_bug.cgi?id=5184 and https://sourceware.org/bugzilla/show_bug.cgi?id=5186
The latter actually implies the opposite should be happening: files given in TZ should be stat()ed just as much as /etc/localtime.
/etc/localtime is set by the administrator. It may change without notice to the user.
TZ is part of the user's environment and the user sets it. All applications run by the user should honor the user's wishes if the user's not falling back to system defaults.
If you're setting TZ for yourself, your libc can update things when you update the variable and restart any applications you're running under the old value. It can therefore save cycles. If you're falling back to the system default that's not under the same user's control, then it must be ready to deal with unexpected changes.
First:
> What’s going on here is that the first call to localtime in glibc opens and reads the contents of /etc/localtime. All subsequent calls to localtime internally call stat, but they do this to ensure that the timezone file has not changed.
and second: read the section titled "Preventing extraneous system calls" for the answer to your second question.
In general, the timezone is set during OS setup, and the system is left in a state where it's up to the applications to figure out what to do. For example, you might configure Apache (yes, I am old, leave me alone) to use a particular timezone. But if Apache senses an env var it may choose to override the configured value with what's in the env var. Or SSH might be configured to pass along all env vars, including TZ, which in all honesty it probably won't even if you tried, but it could, and then the destination server's application has the wrong timezone.
Point is, it's safer not to mess with env vars unless you need to.
Though PeterWillis makes a good point akin to yours, and your (plural) point does make sense.
(Edit: added mention of comment with additional background on why to avoid hardcoding the variable)
And for what benefit? A few hundred syscalls per second? Linux syscalls are fast enough that something of that magnitude shouldn't matter much. Given that /etc/localtime will certainly be in cache with that frequency of access, a stat() should do little work in the kernel to return, so that won't be slow either.
It's good that they did some benchmarking to look at the differences, but this feels like a premature optimization to me. I can't imagine that this did anything but make their application a tiny fraction of a percent faster. Was it worth the time to dig into that for this increase? Was it worth the maintenance cost I mention in my first paragraph? I wouldn't think so.
I'm really trying not to take a crap on what they did; as I said, it's really cool to dig into these sorts of abstractions and find out where they're inefficient or leak (or just great as a learning exercise; we all depend on a mountain of code that most people don't understand at all). But, when looked at from a holistic systems approach, a grab bag of little "tweaks" like this can become harmful in the long run.
I would definitely like to see a before/after real-world metric on impact here though.
Since most embedded systems can directly translate the amount of processing needed to achieve the product goal into a real dollar cost of the hardware, any savings, even a small one, is worth investigating. Since the hardware and system are generally well understood, implementing something like this is much more reasonable.
But I agree, on a general purpose OS doing general purpose thing, an optimization like what's proposed by the article may not be worth the other tradeoffs.
/* Update internal database according to current TZ setting.
   POSIX.1 8.3.7.2 says that localtime_r is not required to set tzname.
   This is a good idea since this allows at least a bit more parallelism. */
tzset_internal (tp == &_tmbuf && use_localtime, 1);
mktime also does the tzset call every time, though:

time_t
mktime (struct tm *tp)
{
#ifdef _LIBC
  /* POSIX.1 8.1.1 requires that whenever mktime() is called, the
     time zone names contained in the external variable 'tzname' shall
     be set as if the tzset() function had been called.  */
  __tzset ();
#endif

and I don't see any way around that other than setting TZ=: or some such.

$ ldd test
linux-vdso.so.1 => (0x00007ffd80baf000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8844bf7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8844fbc000)
Any thoughts why the behavior would be different?

The former is the "traditional" function which returns a pointer to a statically allocated, global "struct tm". The latter is the thread-safe version receiving a pointer to a user-supplied "struct tm" as its second argument.
do {
    t = time(NULL);
    localtime_r(&t, &tm);
    printf("The time is now %02d:%02d:%02d.\n",
           tm.tm_hour, tm.tm_min, tm.tm_sec);
    sleep(1);
} while (--N);
with TZ set to Europe/Berlin, set to :/etc/localtime, or unset I never get a stat on anything:

write(1, "The time is now 07:23:33.\n", 26) = 26
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffd9e798470) = 0
write(1, "The time is now 07:23:34.\n", 26) = 26
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffd9e798470) = 0
write(1, "The time is now 07:23:35.\n", 26) = 26
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffd9e798470) = 0

If I change it to tm = localtime()...

stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
write(1, "The time is now 07:30:56.\n", 26) = 26
nanosleep({tv_sec=1, tv_nsec=0}, 0x7ffc868c3010) = 0
One more reason to switch to the reentrant/thread-safe versions of those ugly library functions :-).

Note, this is using glibc 2.24 under Arch.
$ /lib/libc.so.6
GNU C Library (GNU libc) stable release version 2.24, by Roland McGrath et al.
(...)
Compiled by GNU CC version 6.1.1 20160802.

Now I want to know what other configs reduce the number of system calls. This all adds up, becoming more significant the more hosts there are in your environment.
Rails instrumentation code calls current time before and after any instrumentation block. So when I looked at the trace there were a lot of `stat` calls for `/etc/localtime`, and as stat is an IO operation, I thought I had discovered the cause of the slowness (which I attributed to the high number of IO ops). But surprisingly, when I saw the strace method summary, while the call count was high, the time taken by the calls in total was not significant (<1% if I remember correctly). So I decided to set TZ with the next AMI update 15 months back, but forgot about it totally. I guess I should add it to my Trello list this time.
Also, I think he should have printed the aggregate summary of CPU clock time (`-c`) as well, since that is usually very low.
Yes, on ordinary filesystems if you run stat() over and over again on the same file then it's just copying from the in-memory inode into your struct stat, there's no IO.
[1] https://github.com/systemd/systemd/blob/master/src/timedate/...
The way you'd do this is to open an inotify (or platform equivalent) file descriptor on the first call to localtime(), and on future calls, only bother statting /etc/localtime again if that file descriptor reports something has changed. But checking if an FD has data requires a system call; you'd do a non-blocking read on the FD (or a non-blocking select or poll, or an ioctl(FIONREAD)). It's possible that system call is faster than stat, but it's still a system call.
You could do it with a thread that does a blocking read, but that's a mistake under the current UNIX architecture: lots of stuff (signal delivery and masks, for instance) gets weird as soon as you have threads, so unconditionally sticking a thread into every process using libc is a bad plan.
You could do it with fcntl(F_SETSIG), which would send you a signal (SIGIO by default) when the inotify file descriptor is readable, but you couldn't actually use SIGIO, since the user's process might have a handler. You'd need to steal one of the real-time signals and set SIGRTMIN = old SIGRTMIN + 1. (See https://github.com/geofft/enviable for a totally-non-production-suitable example of this approach.) This would probably mostly work, except changing SIGRTMIN is technically an ABI break (since programs communicate with each other with numerical signal values). You could maybe use one of the real-time signals that pthread already steals. Also, using signals is sort of a questionable idea in general; user programs now risk getting EINTR when /etc/localtime changes, which they're likely not to be prepared for. A single-threaded glibc program that sets no signal handlers never gets any EINTRs, which is nice.
In an ideal world, every program would have a standard event loop / mechanism for waiting on FDs, and libc could just register the file descriptor with that event loop. Then you'll get notified of /etc/localtime changing after the next run of the event loop, which is good enough, and it would take zero additional system calls. But unfortunately, UNIX wasn't designed that way, so something at libc's level can't assume the existence of any message loop. A higher-level library like GLib or Qt or libuv could probably do this, though.
(Better yet, if it were possible to memory-map the directory entry for /etc/localtime, parsing could be avoided as well).
I agree the best solution is to provide a more sensible API. Why should an application be limited to only _one_ timezone?
$ whois packagecloud.io|grep Owner
Owner Name : JOSEPH DAMATO
Owner OrgName : COMPUTOLOGY, LLC
Owner Addr : 359 FILLMORE ST!12
Owner Addr : SAN FRANCISCO
Owner Addr : CA
Owner Addr : US

I'm under the impression that he may also write a lot of these himself? Not entirely certain, though.
(man tzset for more info and examples of different values TZ can be set to. Mine is set to "Europe/London" which handles the DST switches automatically.)
Convert to local time only on the edges, and only for end users.
For example, the timezone America/New_York contains information not only about the UTC offset, but the offset during and outside DST, and when DST starts and ends. (And historical starts and ends too, so that UTC → local conversions (and vice versa) use the rules that were in force at that time, not the rules that are present now, which may be different.)
E.g., my home desktop runs in the timezone America/Los_Angeles all year around. Most of my servers run in the timezone UTC all year. Both always have the appropriately correct time.
Although the hour offset can change at extremely short notice (e.g. discontinuing Daylight Savings during Ramadan [1]), the timezone declaration (e.g. "Africa/Casablanca"), shouldn't need to change, just the underlying timezone database.
[1] https://en.wikipedia.org/wiki/Daylight_saving_time_in_Morocc...
TZ=CET

or TZ=:Europe/Copenhagen

but not TZ=Europe/Copenhagen

echo 'TZ=:/etc/localtime' > /etc/env.d/00localtime
echo 'TZ=:/etc/localtime' >> /etc/environment

I see one reference to Apache below, but not whether it actually made a measurable difference.
write(1,"Greetings!\n",11) = 11 (0xb)
access("/etc/localtime",R_OK) = 0 (0x0)
open("/etc/localtime",O_RDONLY,037777777600) = 3 (0x3)
fstat(3,{ mode=-r--r--r-- ,inode=11316113,size=2819,blksize=32768 }) = 0 (0x0)
read(3,"TZif20000000000000"...,41448) = 2819 (0xb03)
close(3) = 0 (0x0)
issetugid() = 0 (0x0)
open("/usr/share/zoneinfo/posixrules",O_RDONLY,00) = 3 (0x3)
fstat(3,{ mode=-r--r--r-- ,inode=327579,size=3519,blksize=32768 }) = 0 (0x0)
read(3,"TZif20000000000000"...,41448) = 3519 (0xdbf)
close(3) = 0 (0x0)
write(1,"Godspeed, dear friend!\n",23) = 23 (0x17)
(FreeBSD caches the database on the first call: https://svnweb.freebsd.org/base/head/contrib/tzcode/stdtime/... )

It looks like musl, which is used on Alpine Linux images for example, will only read it once, and then cache it:
https://github.com/esmil/musl/blob/master/src/time/__tz.c#L1...
It has a mutex/lock around the use of the TZ info, but avoids re-stat'ing the localtime file.