At Santiment we collect data from a variety of sources, crunch it and provide insightful metrics for investors. To achieve this, we built a robust infrastructure involving hundreds of gigabytes of RAM and dozens of CPUs. The total data pool that we use for our computations is growing quickly and currently exceeds 2 TB. We are looking for big data engineers to help us grow the set of metrics we compute over the data that we collect. If you are interested in financial metrics, graph theory, NLP and applying all of this to an enormous data set, we have a variety of tantalizing challenges to offer.
As a big data engineer, you will analyze blockchain and social data sets and develop metrics on top of them, such as topic detection, sentiment analysis, flow of funds and classification of addresses. We have a wide set of challenges and are actively exploring different approaches, so your work will involve a lot of research.
You will also gain experience in how different blockchains work and how they can be analyzed efficiently. We are developing new financial metrics to measure the adoption of these new monetary systems.
What we’re looking for
- Experience with Spark, Hadoop and/or Flink
- Experience working with large datasets (100 GB+)
- Experience with Scala/Java and Python
- Experience using SQL for data exploration and analysis
What we offer
- An established big data pipeline and CI/CD process, which simplify the development of new metrics.
- Remote team. You can work from anywhere you want. Most of the team is based in Europe. If you’d prefer to work alongside the team in person, we have offices in Sofia, Frankfurt, Zurich, Minsk, and Belgrade. We have standups twice a week and communicate using Discord.
- Cutting-edge technology stack, using highly scalable infrastructure, CI/CD processes and a true microservice architecture.
- Opportunity to work on the latest blockchain technologies and dive deep into them. The better we understand each blockchain, the better the metrics we can develop.
- Opportunity to work on open source. Some of the current open source libraries that we maintain include:
  - https://github.com/santiment/sanpy - a Python library for accessing our data with pandas
  - https://github.com/santiment/san-exporter - a Node.js library for feeding data to our big data pipelines
- Opportunity to contribute to our company blog with results from our research.
- Competitive compensation and SAN token rewards.