Deep Learning SW Development Engineer - Training Libraries
Location: On-Site, Austin, TX

Job Description:

At AMD, we don’t just create technology; we create experiences that change the world. From gaming to artificial intelligence to the data center, we’re powering the next generation of computing. Whether it’s driving innovations in GPUs, AI, or cutting-edge hardware, everything we do is driven by a singular mission: to transform lives through technology.

And we need YOU to be part of this revolution. If you're an engineer with a passion for performance optimization, particularly in the world of deep learning, come join our dynamic team and build the future of AI.


What You’ll Be Doing:

As a Deep Learning Software Development Engineer at AMD, you'll be an integral part of our team of industry experts. You’ll work on improving the performance of key deep learning training libraries on AMD GPUs, targeting the most exciting applications and benchmarks of the future.

Your responsibilities will include:

  • Optimizing Deep Learning Libraries: Fine-tune open-source deep learning libraries such as Megatron and Transformer Engine to boost performance on AMD GPUs, helping our customers get the most out of their hardware.

  • High-Performance Computing: Analyze and optimize key deep learning models, focusing on both multi-GPU (scale-up) and multi-node (scale-out) environments. You'll ensure that our solutions excel in the high-performance world of distributed computing.

  • Pioneering New Technologies: Contribute to the development of groundbreaking new hardware (ASICs and GPUs), bringing fresh approaches to deep learning and AI workloads.

  • Data-Driven Optimization: Take a data-driven approach to optimization, applying software engineering best practices to create innovative solutions that push AMD’s capabilities forward.

  • Problem-Solving & Debugging: Work through complex issues, researching and applying more efficient methods to meet performance objectives. You’ll debug and resolve problems while continually innovating and improving.

  • Collaboration: Partner with internal GPU library teams and work closely with technical peers to fine-tune and optimize deep learning training.


What We’re Looking For:

We’re looking for someone who is innovative, passionate, and a team player. If you love solving complex problems and working at the cutting edge of deep learning technology, this role is for you.

  • Programming Expertise: You’re proficient in C/C++ and Python. Your object-oriented programming skills shine, and you’re passionate about writing high-quality code.

  • Deep Learning Knowledge: Familiarity with deep learning frameworks and optimization techniques, including experience with GPU computing (HIP, CUDA, OpenCL), is essential. You’ll be comfortable working on performance-driven AI workloads.

  • Concurrency and Performance Optimization: You know the ins and outs of concurrent programming, threading APIs, and floating-point operations. You’ll apply your understanding of precision and accuracy to drive performance improvements.

  • Software Development Best Practices: You’re well-versed in source control (such as Git and GitHub), CI/CD, and Linux debuggers and profilers. You understand the importance of strong software development processes.

  • Communication Skills: You can effectively communicate technical challenges and solutions, both in writing and in presentations, ensuring that teams across AMD are aligned and informed.


Preferred Experience:

  • Deep Learning Workloads: Experience optimizing performance for deep learning models (bonus points for experience analyzing and optimizing throughput).

  • Numerical Expertise: A solid understanding of the numerics of floating-point operations, and how they impact the performance of deep learning models.


Academic Background:

  • Bachelor’s or Master’s in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.

  • Advanced degrees (Master’s or PhD) and professional experience are a big plus.


Why AMD?

At AMD, we don’t just care about the work we do – we care about how we do it. Our culture is built on innovation, execution excellence, and diversity of thought. We push the boundaries of technology every day to make the impossible possible.

Here’s what you’ll get by joining us:

  • Competitive Salary & Incentives: You’ll receive a competitive salary, and depending on your role, bonuses or sales incentives. Plus, if you're eligible, you could participate in our Employee Stock Purchase Plan.

  • Comprehensive Benefits: Access to an extensive range of benefits, including healthcare, dental, and vision insurance, as well as opportunities for professional growth and development.

  • A Culture of Inclusion: At AMD, diverse perspectives are what make us strong. We foster an environment where everyone is heard and valued, and innovation thrives when different ideas come together.

  • Exciting Projects: Work on cutting-edge deep learning technologies, collaborate with brilliant minds, and contribute to the next generation of AI and GPU innovations.


Ready to Make an Impact?

AMD is where your expertise can truly shine and help shape the future of computing. If you’re a passionate software engineer looking to push the limits of deep learning and GPU performance, we’d love to hear from you.

Apply now and join the team that’s helping to advance the next era of computing!

