Singapore

Web Scraping and Senior Data Acquisition Engineer

 Job Description:

Our client is an AI-powered research platform. They organize unstructured information in crypto, making it accessible to investors and researchers.

Responsibilities

  • Work closely with the co-founding team to define priorities and develop information sourcing roadmaps
  • Lead the effort to design and implement the architecture of a large-scale crawling system (100+ crawlers)
  • Design, implement, and maintain various components of data acquisition infrastructure (building new crawlers, maintaining existing crawlers, data cleaners & loaders)
  • Build pragmatic, scalable, and statistically rigorous solutions to large-scale web and data infrastructure problems by leveraging or developing statistical and machine learning methodologies
  • Effectively advocate technical solutions to research and engineering teams and to business audiences

Requirements

  • Bachelor's degree in a quantitative field (e.g., Computer Science, Engineering, Mathematics, Statistics, Operations Research, or another related field)
  • 3+ years of experience with Python for data wrangling and cleaning
  • Expertise in running, monitoring, and maintaining all aspects of a scraping pipeline end to end (building and maintaining 100+ spiders, working around bot-prevention techniques, data cleaning and pipelining); familiarity with scraping libraries and monitoring tools is highly recommended (BeautifulSoup, XPath, Selenium, Puppeteer, Splash)
  • Experience in extracting data from multiple disparate sources including HTML, XML, REST, GraphQL, PDF, and spreadsheets
  • Experience in using techniques to protect web scrapers against bot detection, site bans, IP leaks, browser crashes, CAPTCHAs, and proxy failures
  • OOP, SQL and Django ORM basics

What's next?

If you're interested in this role, click Apply To Position or drop an email to don@adstifysearch.com


Don Chan
Senior Consultant
EA Personnel Number: R1763146
EA License Number: 20C0292

  Required Skills:

REST, Prevention, OOP, Django, Spreadsheets, Selenium, Sourcing, Machine Learning, Components, Statistics, Architecture, Infrastructure, XML, Mathematics, Computer Science, Python, Email, Research, HTML, SQL, Engineering, Design, Business, Science

 Salary Package:

$75,000.00 - $150,000.00 (US Dollar)