Ringa - Massive real-estate data mining

When I worked at Ringa, a company from Sao Carlos, Brazil, I was part of a project that involved large-scale mining of real-estate data. The goal was to build a geographical database of real-estate properties in Brazil so that the client who hired us could run its calculations faster. The project combined web crawlers, complex data processing pipelines, and a dashboard for data visualization.


There were three main engineering challenges. First, the web crawler had to run at high scale while staying modular and reusable, because the data was scraped from many different websites. Second, the data processing pipeline had to let only high-quality data reach storage. Third, we had to use techniques to avoid bot detection by the websites.
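To illustrate the first two challenges, here is a minimal sketch of a per-site parser registry feeding a shared quality gate. It is not the code from the project. The `Listing` fields, the `site_a` source name, the raw field names, and the validation thresholds (including the rough bounding box around Brazil) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Listing:
    """Common record that every site-specific parser must produce (hypothetical schema)."""
    source: str
    url: str
    price: float
    latitude: float
    longitude: float


# Hypothetical registry: one parser per website, all returning the same Listing type.
PARSERS: Dict[str, Callable[[dict], Optional[Listing]]] = {}


def register(source: str):
    """Decorator that plugs a site-specific parser into the registry."""
    def decorator(func):
        PARSERS[source] = func
        return func
    return decorator


@register("site_a")
def parse_site_a(raw: dict) -> Optional[Listing]:
    # Field names ("link", "price", "lat", "lng") are invented for this example.
    try:
        return Listing(
            source="site_a",
            url=raw["link"],
            price=float(raw["price"]),
            latitude=float(raw["lat"]),
            longitude=float(raw["lng"]),
        )
    except (KeyError, TypeError, ValueError):
        return None  # malformed record: drop it instead of crashing the crawl


def is_high_quality(item: Listing) -> bool:
    # Illustrative checks only: positive price and coordinates inside a
    # rough, approximate bounding box around Brazil.
    return (
        item.price > 0
        and -34.0 <= item.latitude <= 6.0
        and -74.0 <= item.longitude <= -34.0
    )


def clean(source: str, raw_items: Iterable[dict]) -> Iterator[Listing]:
    """Parse raw records from one source and yield only those passing the quality gate."""
    parser = PARSERS[source]
    for raw in raw_items:
        item = parser(raw)
        if item is not None and is_high_quality(item):
            yield item
```

With this layout, supporting a new website only means registering one more parser. The quality checks and the storage step stay the same for every source.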


In the end, we captured more than 1 million properties along with their characteristics, prices, images, and other details. We used tools such as Scrapy (Python), Selenium, PostgreSQL, MongoDB, and Redis queues.


If you want to see the other projects I worked on, please check my projects on the AI voice agent for outbound calls and in the healthcare, flights, and education industries.