That’s pretty cool. How many pages do you have in your database so far? Also, since you are working on a search engine, I would recommend reading this article: https://archive.org/details/search-timeline
No problem. You might consider using data from the Common Crawl to boost your index size. If you get the extracted text files (called WET instead of WARC), they don’t take up much space: I have one from 2014 with about 73,000 pages in it, and it only takes up about 300 MB uncompressed. Those files are surprisingly easy and fun to work with, and downloading them will probably always be faster than crawling on your own.

If you use files from the older crawls it will probably make your product more distinctive, but there will probably be a lot of 404s, so you might have to give people an option to view the cached page or go to the Wayback Machine. You probably don’t have the resources for this, but I would love it if someone made a search engine that lets you search through all 115 or so crawls they have, which would be around 100 billion pages and take up around 816 TB.
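If it helps, here’s a rough sketch of how you might pull the extracted text out of a WET file using the warcio library (the filename is just a placeholder, and you’d swap the print for whatever feeds your indexer):

    # Sketch: iterate over a Common Crawl WET file with warcio
    # (pip install warcio). Filename below is a placeholder.
    from warcio.archiveiterator import ArchiveIterator

    with open('example.warc.wet.gz', 'rb') as stream:
        for record in ArchiveIterator(stream):
            # WET files store the extracted plain text as 'conversion' records
            if record.rec_type == 'conversion':
                url = record.rec_headers.get_header('WARC-Target-URI')
                text = record.content_stream().read().decode('utf-8', errors='replace')
                # hand url/text off to your indexer here
                print(url, len(text))

ArchiveIterator handles the gzip decompression for you, so you can point it straight at the .wet.gz files as downloaded.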