About me

I’m a University of Bristol graduate in Computer Science & Electronics with a deep interest in computer architecture, HPC, DSP, and scientific and low-level programming. In my free time, I like reading CPU/GPU specification sheets and learning how the hardware works; my favourite project is a simulator of an out-of-order, non-blocking CPU. In my career, I have gained experience in low-level and real-time programming, working on embedded devices such as the NVIDIA Jetson, the Texas Instruments C6678 DSP, and various FPGAs. Currently, I’m working for Graphcore as a graduate member of the silicon team, rotating through the physical design, verification, and RTL teams.


Member of the Silicon Team

Sep 2019 - Present
Graphcore Limited, Bristol UK

Graduate role, currently in the Physical Design team, rotating into Verification and RTL in the future.

Embedded Software Engineer

July 2019 - Aug 2019
QLM Technology LTD, Bristol UK

Contract summer job after graduating and before starting my career at Graphcore. During my time at QLM I accomplished the following:

  • Achieved an 80x speed-up of existing MATLAB code by rewriting it in Julia, combined with calls to C functions and parallel programming.
  • Created tools and a library for the host machine to communicate with an FPGA.
  • Improved the company's workflow by introducing version control with Git, writing documentation, and using open-source languages and tools.

Research Engineer (Machine Learning)

Jan 2018 - Mar 2019
Toshiba Research Europe Limited, Bristol UK

From Jan 2018 to Sep 2018 I worked full-time at Toshiba as a Research Engineer in Machine Learning. Afterwards I worked part-time while finishing my master's degree. During my time there I accomplished the following:

  • Invented and filed a patent for a system that uses reinforcement learning to reduce the network traffic of a distributed neural network by 60%.
  • Used machine learning libraries such as TensorFlow on embedded devices like the NVIDIA Jetson to simulate IoT networks.
  • Produced demos using JavaScript and ReactJS for use at general meetings and conferences.


Superscalar CPU Simulator - A CPU simulator capable of out-of-order execution, speculative execution, and converting assembly code to machine code.
Disparity Algorithm Optimisation for TI C6678 DSP - Collaborative university project on optimising the C code of a disparity algorithm. We made the code run 20 times faster by reducing loops and adding pragma optimisations while preventing race conditions. We also wrote assembly code to compare the performance, which made our project one of the highest marked in the unit.
Jacobi Iterative Method OpenMP Optimisation - Optimising the Jacobi iterative method by parallelising subtasks using OpenMP and SIMD while ensuring NUMA awareness and preventing race conditions.
Musication - Collaborative project. A web app for creating and streaming mappings of music to locations, using only AWS Lambda. I worked on building the interactive map in React and linking it to DynamoDB, which stores the music mappings.


  • System and Method For Distributed Learning
  • Abdussalam Elturki and Aftab Khan
    US Patent US20200034747A1

    Skills & Proficiency