Machine Learning and Data Engineer
CTC Resourcing Solutions
- Basel
- Freelance
- Full-time
- Integrate off-the-shelf open-source embedding models with the system to generate text embeddings from research publications and other text-based sources.
- Design and implement the data processing pipeline that converts PDF, XML, or other files into a format suitable for text embedding.
- Set up and maintain the vector database infrastructure, ensuring efficient storage and retrieval of embeddings.
- Develop and maintain the API for semantic search, allowing for robust querying capabilities.
- Collaborate with stakeholders to gather requirements and ensure the system meets the needs of the organization.
- Conduct testing and quality assurance to ensure the reliability and accuracy of the search results.
- Document the system architecture, API usage, and operational procedures for future reference and maintenance.
- Strong programming skills, particularly in Python, and experience with machine learning libraries such as TensorFlow or PyTorch.
- Minimum of 7 years of experience with data engineering tasks, including data extraction, transformation, and loading (ETL).
- Familiarity with vector database technologies (e.g., FAISS, Milvus, Elasticsearch) and database indexing.
- Knowledge of API development and best practices for scalability and security.
- Ability to work independently, manage multiple priorities, and communicate effectively with both technical and non-technical stakeholders.
- Fluent in English.