I used Wireshark on Windows to check that everything was set up correctly and to see what kind of requests the phone makes.
You can use WiFi instead of Bluetooth the same way. You only need to use the "hotspot" option, provide DHCP to the phone, and set your Linux machine as the gateway. You can probably do that with a router too, for example by connecting its WAN port to your Linux machine or by setting up traffic redirection.
On Linux I redirected traffic from the phone to localhost on ports 53 (DNS) and 80/443 (HTTP/HTTPS) and rejected any other traffic (there were some requests to time servers, sent by the DRM component of Android). I also ran a DNS server (dnsmasq) and the Squid HTTP proxy, which can process redirected traffic (Squid can also generate certificates to decrypt HTTPS traffic, which was very useful, though it took some time to find the correct settings). I set up dnsmasq and Squid to serve requests based on white and black lists.
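To make the redirection step more concrete, here is a rough Python sketch that only prints the kind of iptables rules I mean. It is not my exact configuration: the interface name `bnep0` and the local Squid ports 3129/3130 are assumptions for illustration, and the function name `build_redirect_rules` is made up. Adjust them to whatever your Bluetooth/WiFi interface and proxy actually use.

```python
from typing import List


def build_redirect_rules(iface: str,
                         http_port: int = 3129,
                         https_port: int = 3130) -> List[List[str]]:
    """Return iptables argument lists; nothing is executed here.

    iface, http_port and https_port are illustrative assumptions.
    """
    rules: List[List[str]] = []

    # DNS (UDP and TCP) from the phone goes to the local resolver on port 53.
    for proto in ("udp", "tcp"):
        rules.append(["iptables", "-t", "nat", "-A", "PREROUTING",
                      "-i", iface, "-p", proto, "--dport", "53",
                      "-j", "REDIRECT", "--to-ports", "53"])

    # HTTP and HTTPS go to the local proxy ports.
    for dport, local_port in ((80, http_port), (443, https_port)):
        rules.append(["iptables", "-t", "nat", "-A", "PREROUTING",
                      "-i", iface, "-p", "tcp", "--dport", str(dport),
                      "-j", "REDIRECT", "--to-ports", str(local_port)])

    # Everything else the phone tries to send through this box is rejected.
    rules.append(["iptables", "-A", "FORWARD", "-i", iface, "-j", "REJECT"])
    return rules


if __name__ == "__main__":
    for rule in build_redirect_rules("bnep0"):
        print(" ".join(rule))
```

Printing the rules instead of running them lets you review them first; you would then apply them as root by hand or via `subprocess.run(rule, check=True)`.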
After some tests I found another, easier way to capture traffic from an Android phone. Android has a useful "Always-on VPN" feature that sends all traffic through a specified host (and doesn't allow any network access until the VPN connection is set up). You only need to set up IPsec on a Linux box (I used strongSwan). I used the "Always-on VPN" feature to redirect traffic to my VPS while using a mobile internet connection.
> Based on what you say, maybe you proxied Internet connections through Bluetooth - do you have a way to know whether there was any leakage?
I physically disconnected the laptop from the Internet and monitored the traffic on the Bluetooth interface with Wireshark. The phone did not have a SIM card inside, so it could not connect to a mobile network.
> For example, I've read, but can't confirm, that Android makes connections during bootup and before any firewall takes affect.
This can be detected using my setup. But if software is programmed to send some data only via the mobile network and not via WiFi/Bluetooth, then it is more difficult to detect. You would need to set up a fake BTS (using OpenBTS, for example) to capture that traffic, which requires special (not very expensive) SDR hardware.
> A VPN with a firewall might be easier.
I ended up with the same idea. I even wrote a simple PHP app to manage black and white lists and view logs.
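The list logic itself can be quite small. Below is a minimal Python sketch of one way to decide whether a hostname is allowed. It is not the PHP app mentioned above; the function names and the "blacklist wins, everything unlisted is denied" policy are my assumptions for illustration.

```python
from typing import Iterable


def matches(domain: str, patterns: Iterable[str]) -> bool:
    """True if domain equals a pattern or is a subdomain of it."""
    domain = domain.lower().rstrip(".")
    for pattern in patterns:
        pattern = pattern.lower().strip(".")
        if domain == pattern or domain.endswith("." + pattern):
            return True
    return False


def is_allowed(domain: str,
               whitelist: Iterable[str],
               blacklist: Iterable[str]) -> bool:
    """Assumed policy: blacklist wins, then whitelist, otherwise deny."""
    if matches(domain, blacklist):
        return False
    return matches(domain, whitelist)


if __name__ == "__main__":
    whitelist = {"example.com"}
    blacklist = {"ads.example.com"}
    for host in ("www.example.com", "ads.example.com", "notexample.com"):
        print(host, is_allowed(host, whitelist, blacklist))
```

Matching on whole labels (`"." + pattern`) keeps `notexample.com` from slipping through a whitelist entry for `example.com`.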