table inet raw {
...
counter postgre_tcp_traffic_out {
packets 0 bytes 0
}
...
chain output {
...
meta l4proto tcp skuid postgres counter name "postgre_tcp_traffic_out" notrack
...
}
}
and then view it like this: nft -j list counters | jq '.'
...
"counter": {
"family": "inet",
"name": "postgre_tcp_traffic_out",
"table": "raw",
"handle": 20,
"packets": 255,
"bytes": 17694
}
...
Since nft -j outputs JSON, it can then easily be ingested back into Postgres and indexed. I personally use it together with Zabbix to track per-second differences in the values. It needs some more work, because netfilter can match packets by UID/GID only on output; input has to be matched by destination port instead, 5432 in the case of Postgres.
Systemd seems to report network traffic stats for managed units. It spawns cgroups for units, which among other things track network traffic.
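For anyone wanting to script this: a minimal Python sketch (the function names are mine, purely illustrative) that parses the `nft -j list counters` JSON shown above and computes per-second rates between two snapshots, roughly what a Zabbix item would do:

```python
import json

def extract_counters(nft_json: str) -> dict:
    """Pull named counters out of `nft -j list counters` output.

    nft -j wraps everything in an "nftables" array; entries holding a
    "counter" object look like the sample above.
    """
    doc = json.loads(nft_json)
    counters = {}
    for entry in doc.get("nftables", []):
        c = entry.get("counter")
        if c and "name" in c:
            counters[c["name"]] = {"packets": c["packets"], "bytes": c["bytes"]}
    return counters

def per_second_rates(prev: dict, curr: dict, interval: float) -> dict:
    """Per-second deltas between two counter snapshots taken `interval`
    seconds apart."""
    rates = {}
    for name, now in curr.items():
        before = prev.get(name, {"packets": 0, "bytes": 0})
        rates[name] = {
            "pps": (now["packets"] - before["packets"]) / interval,
            "bps": (now["bytes"] - before["bytes"]) / interval,
        }
    return rates
```

Feeding the resulting rows into Postgres (e.g. via a multi-row INSERT) is then straightforward.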
I spent some time searching for a tool that can ingest realtime network traffic data into Postgres but had no luck, so I developed this extension and used it internally in our team. Thanks to Rust, pgx, and libpcap, the development journey was easy and enjoyable.
I'd love to hear more feedback. Any contributions, feature requests, bug reports, or ideas are welcome.
Thanks.
I'd be super interested in tooling to help with this. I'm running – literally right now – a Python script that is writing about 100k flows per second into Postgres (TimescaleDB). It is batch rather than realtime, and it geocodes the flows on the way in.
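A minimal sketch of that batching pattern (the class and names are mine, purely illustrative; the flush callback would typically wrap a COPY or multi-row INSERT into the hypertable):

```python
from typing import Callable, List

class FlowBatcher:
    """Accumulate flow rows and flush them in fixed-size batches.

    `flush` is an injected callable so the batching logic stays
    testable on its own, without a live database connection.
    """
    def __init__(self, flush: Callable[[List[tuple]], None], batch_size: int = 5000):
        self.flush = flush
        self.batch_size = batch_size
        self.rows: List[tuple] = []

    def add(self, row: tuple) -> None:
        """Buffer one flow row; flush when the batch is full."""
        self.rows.append(row)
        if len(self.rows) >= self.batch_size:
            self.drain()

    def drain(self) -> None:
        """Flush any buffered rows (call once more at shutdown)."""
        if self.rows:
            self.flush(self.rows)
            self.rows = []
```

Geocoding each row before `add()` keeps the enrichment on the ingest path, as described above.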
At that point you have root and can use any of the dozen established ways to do the same thing, though?
Setup like this:
CREATE EXTENSION file_fdw;
CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
then do this for the easily interpretable proc files:
CREATE FOREIGN TABLE IF NOT EXISTS proc_loadavg(
load1 decimal,
load5 decimal,
load15 decimal,
threads_runnable_total text,
most_recent_pid integer
) SERVER pglog OPTIONS ( filename '/proc/loadavg', header 'false', delimiter ' ' );
For harder things, read the file as lines instead of fields, then create a view with some regexes to split it into fields:
CREATE FOREIGN TABLE IF NOT EXISTS proc_meminfo(
line text
) SERVER pglog OPTIONS ( filename '/proc/meminfo', header 'false', delimiter '$' );
CREATE OR REPLACE VIEW proc_meminfo_interpreted AS
WITH arr AS (SELECT regexp_split_to_array(line,':| +') a FROM proc_meminfo)
SELECT a[1] as name,a[3] as value FROM arr;
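To sanity-check what that view computes, here is the same split in Python (same regex pattern; note SQL arrays are 1-indexed, so the view's a[1] and a[3] correspond to indices 0 and 2 below — the empty string between ':' and the spaces occupies the slot in between):

```python
import re

def parse_meminfo_line(line: str) -> tuple:
    """Mirror of the proc_meminfo_interpreted view: split a /proc/meminfo
    line on ':' or runs of spaces and return (name, value)."""
    a = re.split(r':| +', line)
    return a[0], a[2]
```

For example, a line like `MemTotal:       16280636 kB` yields the name `MemTotal` and the value `16280636`.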
Hardest part is creating semi-legible source code in HN ;-)
Postgres isn't a great TSDB: the indexes don't understand columns that will only have updates at one end, it doesn't do column compression, range queries are expensive, etc.
Perhaps it's time to just set up a time series database... like InfluxDB.
If anyone's curious about TimescaleDB, it's packaged as an extension to Postgres, optimizing for performance, storage, and analysis of time series data. Implementing columnar compression algorithms is a big part of the secret sauce that makes TimescaleDB a popular choice with Postgres and SQL developers. You can read more about that on the Timescale blog (I'm Timescale's community manager btw). https://www.timescale.com/blog/search/?query=compression
If anyone's curious, the youtube channel may be a good place to start, especially this playlist https://www.youtube.com/playlist?list=PLsceB9ac9MHTtM1XWONMR...