Software Engineer - Data Engineering & ML
Job Type: Full Time
Regions: Bay Area, CA; Washington; West Coast, USA
Work arrangement: Hybrid - Open to remote
Salary: 140K - 180K/year

Location: Bay Area, CA or Washington; Bay Area Preferred, Remote option
Term: Full-Time; Permanent

Rhizome is seeking a Software Engineer who can scale Data Engineering in support of Machine Learning Development at Utility Scale. The ideal candidate will have a strong background in Data Processing Pipelines, DAGs, ETLs, Feature extraction, and Statistical Analytics using Python and AWS cloud.

The ideal candidate will have deep expertise in working with GIS data, Relational Databases, CSVs, and Excel at Utility scale. Successful candidates will also have practical experience building large scale ETL pipelines on AWS or GCP for Data Engineering, Feature Extraction, Statistical Analysis and Correlations.

About Rhizome

Rhizome is at the forefront of developing decision intelligence technology at the intersection of climate science and infrastructure systems. Our team pursues this endeavor with the wisdom and steadiness of industry veterans, and the curiosity, grit, and energy of startup and technology enthusiasts.

Our climate resilience SaaS platform helps utilities, governments, and industries plan for greater resilience to climate change and extreme weather by applying AI to a vast amount of information that characterizes infrastructure assets and their vulnerability to extreme weather. Focused on the $500B resilience investment gap in the grid today, our mandate is simple: Help electric utilities proactively adapt to climate change by integrating cutting-edge climate-asset intelligence into their existing planning workflows.
As the world experiences record-breaking climate-related impacts, especially related to grid failures, our platform identifies future extreme weather vulnerabilities on utility assets at high resolutions and empowers planners to optimize investment deployments that keep society safe during natural hazard events.

Roles and Responsibilities

- Design, construct, and maintain data pipelines to combine large volumes of geospatial, climate + weather, and electric utility datasets
- Work with a cross-functional team to deliver data in support of analytic and ML pipelines
- Develop deep familiarity with electric utility datasets and take ownership of integration of new datasets into our existing environments
- Contribute to ML model development in the context of reliability and resiliency for the grid
- Optimize storage and ETL pipelines
- Develop versioned, scalable, repeatable, and reliable pipelines that convert utility data in GIS and tabular formats to Delta Lake format
- Scale and automate data pipelines for statistical analysis for internal and external use cases
- Exceptional ability to diagnose data issues and discrepancies
- Ability to modularize different stages of data ingestion and verification
- Ability to write algorithms for data sanity checks and classification of different data elements
- Ability to develop heuristics and suggestions for missing data items
- Ability to validate and unit test data pipelines

Qualifications

- Exceptional Python programming skills
- Experience and expertise in working with Earth Observation and other geospatial data, at the gigabyte-to-terabyte scale
- Strong programming skills with NumPy, SciPy, Xarray
- Strong programming skills with frameworks like Dagster or Airflow or Prefect
- Strong programming skills with Databricks or Apache Spark or Amazon EMR or Cloudera
- Deep expertise in storage optimization and partitioning on RDS, Postgres, PostGIS, Delta Lake
- Hands-on experience with GIS datasets and QGIS or ESRI
- Hands-on experience with multi-dimensional Climate or Weather data
- Familiarity or hands-on experience with Secure Cloud Development
- Exceptional ability to diagnose data issues and discrepancies
- Ability to modularize different stages of data ingestion and verification
- Ability to write algorithms for data sanity checks and classification of different data elements
- Ability to develop heuristics and suggestions for missing data items
- Ability to validate and test pipelines and write functional tests to validate the pipelines

We’ll pay extra close attention if you have:

- Exposure or experience with Electric Utility Tech Stack
- Exposure to applied ML and Data Engineering with Electric Utility background
- Experience in early-stage startup environments

Culture and Core Values

At Rhizome, we lead with compassion and empathy, aiming to understand before we help. Our thesis as technologists is that, in order to fulfill our mission to protect society from the impacts of climate change through intentional, intelligent infrastructure planning, we need to embark on a journey of respectfully listening, learning, and then problem-solving. This sentiment is represented through our core values:

- Empathy: Understanding and relating to problems, customers, and each other, with humility.
- Creativity: Exploring with curiosity and building with intention.
- Aspiration: Striving for societal impact, personal fulfillment, and simply doing good work.
- Tenacity: Pushing past barriers and the status quo with a sense of optimism and determination.
- Service Excellence: Delivering high-quality outcomes for our customers, colleagues, and communities.

Compensation and Benefits

Rhizome offers competitive salaries and an excellent package of benefits and stock options. Compensation is based on a variety of factors including experience, role, and location.

Rhizome Data

A changing climate demands Resilience by Design

We like solving hard problems with creativity, tenacity, and empathy for our customers.
At the same time, we believe that being better stewards in our community, building lasting relationships, and connecting dots are critical to effecting long-lasting change. AI is what we build, and resilience is what we serve. We've assembled a team that you can count on, because at the end of the day, if the grid can be 99.9% reliable, why can't we?
- Location: San Francisco, CA, United States
- Job Type: Full-Time