Working Student AI Venture (Web Scraping & Data Extraction) (m/f/d)
Placed Technologies GmbH
HR services and consulting
- Verified job posting
- Berlin
- Students
About this job
Intro
PLACED is a young SaaS company based in Berlin-Mitte. With our AI-based platform, we are revolutionizing the world of recruitment agencies by helping them find the right jobs for their candidates faster and take their sales to a new level. Our goal is clearly defined: we want to become the next unicorn in the SaaS sector!
We're looking for a sharp, detail-oriented Working Student to support our data engineering team with large-scale web scraping and structured data extraction. You will play a critical role in building a scalable pipeline that fetches, cleans, and structures job data from job boards, ATS portals, and career websites. Learn and grow with us!
Tasks
- Go through job boards, career pages, and portals to collect job listing links
- Get the web page content (HTML) from each link using tools like requests, curl, or services like ScrapeOps
- Use Scrapy to build and manage web crawlers that pull out useful data from those pages
- Clean up and organize the data, and turn it into a neat JSON format
- Deal with messy web pages that have inconsistent structures or complex elements
- Upload the final cleaned data to an Amazon S3 bucket (cloud storage) in the right format
- Add error handling and logging so your code runs smoothly even when there are issues
- Work closely with the product and data teams to make sure the data is accurate and useful
Requirements
- You’re currently studying Computer Science, Data Science, or a similar field at a German university
- You’re comfortable writing scripts in Python and enjoy automating things
- You’ve worked with tools like Scrapy, requests, or BeautifulSoup to scrape websites
- You know how to read and work with HTML and JSON data
- You’ve heard of APIs and maybe even used one before
- You understand basic cloud tools like AWS S3 (or are eager to learn)
- You know the basics of good web scraping (like respecting robots.txt, waiting between requests, etc.)
- You’ve used Git to manage your code (e.g., GitHub projects or university work)
- You pay attention to detail, enjoy solving problems, and can work on tasks independently
- Bonus if you’ve used tools like Playwright or Selenium, or have dealt with websites that are hard to scrape (like ones that rely on JavaScript rendering or CAPTCHAs)
Benefits
- Flexible working hours tailored to your university schedule
- Learn hands-on from a team building real-world automation and data pipelines
- Gain experience with web-scale data collection and cloud deployment
- Opportunity to grow into a full-time role upon graduation
Closing
Interested? Send us your CV, GitHub/portfolio, and a short note about a scraping challenge you've solved.