Big Data and distributed computing are at Adaltas' core. We support our partners in the commissioning, maintenance, and optimization of some of France's largest clusters. Adaltas is also an advocate of and active contributor to Open Source, with our latest focus being a new, completely open source Hadoop distribution: the TOSIT Data Platform (TDP).
During this internship, you will join the TDP project team and contribute to the development of the project. You will deploy and test production-ready TDP Hadoop clusters, contribute code in the form of iterative improvements to the existing codebase, share your knowledge of TDP in the form of customer-ready support resources, and gain experience using core Hadoop components such as HDFS, YARN, Ranger, Spark, Hive, and ZooKeeper.
This will be a serious challenge, with a large number of new technologies and development methods for you to tackle from day one. In return for your commitment, you will finish your internship fully equipped to take on a role in the Big Data domain.
Adaltas specializes in Big Data, Open Source, and DevOps. We work both on site and in the cloud. We are proud of our open source culture: our contributions have helped users and businesses worldwide, and our articles share our knowledge of Big Data, DevOps, and several complementary topics.
Developing the TDP platform requires an understanding of Hadoop's distributed computing model and how its core components (HDFS, YARN, etc.) work together to solve Big Data problems. A working knowledge of Linux and the command line is required.
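To give a feel for that model, word count is the classic MapReduce example: a map phase emits `(key, value)` pairs and a reduce phase aggregates values per key; Hadoop runs the two phases distributed across the cluster. A minimal single-process Python sketch of the idea (an illustration only, not actual Hadoop or TDP code):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input split.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # Reduce: sum the counts for each key (word).
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

lines = ["Big Data on Hadoop", "Hadoop scales Big Data"]
print(reduce_phase(map_phase(lines)))
# {'big': 2, 'data': 2, 'on': 1, 'hadoop': 2, 'scales': 1}
```

In a real cluster, the input splits live on HDFS, YARN schedules the map and reduce tasks on worker nodes, and a shuffle step groups each key's pairs before reduction.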
During the internship you will learn:
- Hadoop cluster management
- Hadoop cluster security, including Kerberos and SSL/TLS certificates
- High availability (HA) of services
- Scalability in Hadoop clusters
- Monitoring and health assessment of services and jobs
- Building fault-tolerant Hadoop clusters that can recover lost data after infrastructure failures
- Infrastructure as Code (IaC) via DevOps tools such as Ansible and Drifter
- Code collaboration with Git on both GitLab and GitHub
You will also:
- Familiarize yourself with the TDP distribution's architecture and configuration methods
- Deploy and test secure and fault-tolerant TDP clusters
- Contribute to the TDP knowledge base with troubleshooting guides, FAQs, and articles
- Participate in discussions on TDP project goals and roadmap strategy
- Actively contribute ideas and code to make iterative improvements to the TDP ecosystem
- Research and analyze the differences between the major Hadoop distributions
- Location: Boulogne Billancourt, France
- Language: French or English
- Start date: March 2022
- Duration: 6 months
Much of the digital world runs on open source software and the Big Data industry is booming. This internship is an opportunity to gain valuable experience in both domains. TDP is now the only true open source Hadoop distribution. This is the right time to join us. As part of the TDP team, you will have the opportunity to learn one of the most important models of big data processing and participate in the development and future roadmap of TDP. We believe that this is an exciting opportunity and that after completing your internship you will be ready for a successful career in Big Data.
A laptop with the following features:
- 32 GB of RAM
- 1TB SSD
- 8c/16t CPU
A cluster consisting of:
- 3x 28c/56t Intel Xeon Scalable Gold 6132
- 3x 192GB RAM DDR4 ECC 2666MHz
- 3x 14 480GB SATA SSD (Intel S4500, 6Gbps)
A Kubernetes cluster and a Hadoop cluster.
- Salary €1200/month
- Restaurant tickets
- Transport card
- Participation in an international conference
For any request for further information and to submit your application, please contact David Worms:
#Internship #BigData #Infrastructure #TDP