I received SRE offers from Facebook and Google without a university degree; here is how.

I’m Amirali, and I have been working in the software industry for the past 6 years. In my senior year of high school, I decided I wanted to be a software engineer working at Google or Facebook. So I googled around a bit to find out what they wanted from candidates and found out that a degree isn’t required to get a job at these companies. Equipped with the knowledge that self-study is an alternative path, I decided to embark on this journey, and I’m happy to announce that I’ve received offers from both of these companies on my first attempt at FAANG.

I will mainly focus on the SRE interview preparation and share the resources I used to get offers from these two companies. The specific interviews are the SRE-SE interview (System Engineer pipeline) for Google and the Production Engineer interview for Facebook.

Interview rounds

FAANG divides interviews into three phases which are called recruiter, phone screen, and on-site. Screening starts from the recruiter's call.

There are eight interviews overall. For the recruiter call, you should expect around 15 simple questions on Linux, coding, and networking with one- or two-word answers. The phone screen has one coding and one Linux interview, and last but not least, the on-site consists of system design, coding, network, troubleshooting/Linux, and behavioral.

Google

For Google, there are two pipelines to SRE: SRE-SWE and SRE-SE. The difference is that SWE has two rounds of coding with Leetcode-style questions, whereas SE has a round of Linux internals and a practical coding question working with files. I decided to pick the SRE-SE pipeline.

For the recruiter call, you should expect around five in-depth multiple-choice questions on Linux. The phone screen is a 45-minute round that covers both Linux and coding. The on-site is five interviews: non-abstract large system design or NALSD (unique to Google), Googleyness or behavioral, troubleshooting, Linux internals, and coding.

Recruiter call

For Facebook, if you have enough experience as an SRE, you should be able to answer enough questions to pass. Otherwise, brush up on network (TCP vs. UDP, TCP control bits, etc.), Linux (Process states, typical commands, etc.), and coding (know your Big O, you should be pretty comfortable with time/space complexity analysis).

For Google, you need to be pretty comfortable with Linux before taking the call, so look at the Linux preparation section.

resources:

Phone Screen

For the coding part, you should expect file operations and easy/medium Leetcode questions. The most common data structures/algorithms tested are arrays, hashes, binary search, sorting, and heaps (how and where to use them). Graphs, trees, and backtracking are unlikely. Dynamic programming is entirely out of the picture.

Remember to think of edge cases and ask clarifying questions. For example, what is the file’s encoding? Should I handle parsing issues such as X and Y? What is the typical file size?
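To make this concrete, here is a minimal sketch of the kind of file exercise described above. It is not an actual interview question: the function name `top_k_first_fields` and the log format (whitespace-separated fields with the client address first) are assumptions made for illustration. It streams the file line by line so that a large file doesn't have to fit in memory, tolerates invalid bytes, counts unparseable lines instead of crashing, and uses a heap to pick the top k.

```python
import heapq
from collections import Counter


def top_k_first_fields(path, k=3, encoding="utf-8"):
    """Return the k most frequent first fields in a text file.

    Hypothetical helper: assumes whitespace-separated fields, one record
    per line. Returns (top_k, skipped) where skipped counts empty lines.
    """
    counts = Counter()
    skipped = 0
    # errors="replace" keeps going on invalid bytes instead of raising.
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:  # iterating the file streams it line by line
            fields = line.split()
            if not fields:  # blank or whitespace-only line
                skipped += 1
                continue
            counts[fields[0]] += 1
    # nlargest keeps a heap of size k: O(n log k) rather than a full sort.
    # Ties are broken by the key itself so the result is deterministic.
    top = heapq.nlargest(k, counts.items(), key=lambda kv: (kv[1], kv[0]))
    return top, skipped
```

Whether to skip, count, or fail on bad lines is exactly the kind of thing worth asking the interviewer before writing any code.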

resources:

For the Linux part, you should have a pretty decent understanding of Linux internals (do NOT memorize). For each topic, you should know why it is implemented this way and how it is used.

For example, why are we using virtual memory instead of physical? How is virtual memory translated to physical memory?
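As a rough illustration of the second question, the sketch below translates a virtual address with a toy single-level page table. Everything in it is a simplification chosen for illustration: the 4 KiB page size, the dictionary standing in for the page table, and the names `translate` and `PageFault` are all assumptions. Real hardware walks multi-level tables in the MMU and caches results in the TLB.

```python
PAGE_SHIFT = 12                # assumed 4 KiB pages: 2**12 bytes
PAGE_SIZE = 1 << PAGE_SHIFT


class PageFault(Exception):
    """Raised when a virtual page has no mapping (toy stand-in for a fault)."""


def translate(vaddr, page_table):
    """Translate a virtual address using a dict {virtual page -> frame}."""
    vpn = vaddr >> PAGE_SHIFT            # virtual page number (high bits)
    offset = vaddr & (PAGE_SIZE - 1)     # offset within the page (low bits)
    if vpn not in page_table:
        # A real kernel would try to handle this (demand paging, swap-in)
        # before deciding the access is invalid.
        raise PageFault(f"no mapping for virtual page {vpn:#x}")
    frame = page_table[vpn]
    # The offset is carried over unchanged; only the page number is mapped.
    return (frame << PAGE_SHIFT) | offset


# Virtual page 1 is mapped to physical frame 7.
print(hex(translate(0x1234, {1: 7})))    # 0x7234
```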

These are the topics that you should cover:

  • Virtual memory (Paging, demand paging, anonymous vs. file-backed memory, shared memory, page faults, dirty pages, page cache, swapping, memory mapping, memory protection, memory layout, overcommit, TLB, MMU, OOM, PSI)
  • Signals (Know key signals such as SIGTERM, SIGTSTP, SIGCHLD, SIGKILL, SIGSEGV, signal handlers, signal masking, default handlers, tracing signals)
  • Processes (Exec/Fork, Zombie/Orphan, Interruptible and uninterruptible Sleep, Runqueue/Scheduler latency, Completely fair scheduler and other scheduler policies, Preemption, Context switching, CPU registers and caches, userspace threads/lightweight threads/coroutines)
  • Interprocess communication and Locking (Advantages / Disadvantages of each approach, a rough idea of how it’s implemented, what are the system calls)
  • Networking stack (which part is in the kernel, which part is handled by userspace libraries, common syscalls, sockfs)
  • Control groups (what it is, how it works) and namespaces (unlikely but good to cover)
  • System calls (you should know the 12 key system calls, how they are initiated, CPU protection rings, mode switch, userspace vs. kernel space)
  • Tracing (strace, ltrace, ptrace, perf, user-space tracing, kernel tracing)
  • Virtual file system or VFS (pseudo-file-systems such as proc, sockfs, pipefs and how shared memory integrates with VFS, file descriptors, open file descriptions table, inodes, NFS, LVM, software RAID)
  • Linux boot process (unlikely but good to cover, rather high-level. BIOS -> MBR -> grub -> kernel -> init -> userspace)
  • Main responsibilities of the kernel and init (remember to ask the questions, e.g., why do we need a kernel?)
  • Interrupts (what events cause interrupts, interrupt context vs. thread context, how interrupts are executed, top half and bottom half, and a bit about interrupt masking)
  • Common Linux tools (iostat, vmstat, top, pidstat, uname, touch, rm, cd, kill, iotop, mount, df, du, lsof, etc. You should know when and how they are used and how to read their output)
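A small experiment can tie the signal and process topics together. The sketch below is only an illustration under a few assumptions: it runs on a POSIX system with a `sleep` binary on the PATH, and the function name `terminate_and_reap` is made up. It starts a child (a fork followed by an exec under the hood), sends it SIGTERM, and then reaps it. Between the child's death and the `wait()` call, the child is a zombie.

```python
import signal
import subprocess


def terminate_and_reap(command=("sleep", "60")):
    """Start a child, send it SIGTERM, reap it, and describe how it ended."""
    # Popen forks and execs the command; it returns once exec has succeeded.
    child = subprocess.Popen(list(command))
    # Equivalent to `kill -TERM <pid>`. `sleep` installs no handler, so the
    # default disposition of SIGTERM (terminate the process) applies.
    child.send_signal(signal.SIGTERM)
    # wait() collects the exit status, which removes the zombie entry.
    returncode = child.wait(timeout=10)
    if returncode < 0:
        # subprocess reports "killed by signal N" as a return code of -N.
        return "killed by " + signal.Signals(-returncode).name
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(terminate_and_reap())
```

Running the same experiment under strace, or replacing SIGTERM with other signals, is a cheap way to see the default handlers in action instead of memorizing them.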

Also, you should avoid getting too deep into the internals. Knowing what kinds of data structures are used or how a particular feature is implemented is not important. You are not expected to contribute code to the kernel, write drivers, or kernel modules.

resources:

On-site

When you reach on-site, the recruiter will provide you with a document that covers the topics of the interviews and what you should expect. Treat it just as another source of info. The document isn’t going to be comprehensive at all.

The on-site coding and Linux interviews are the same as the phone screens, but a bit harder and more in-depth.

The network interview is just for Facebook; Google, unfortunately, doesn’t ask any network-related questions. It carries a bit less weight than the other interviews, so if you are tight on time, put more effort into those.

The interview is pretty easy and simple. You are meant to lead the interview, and the questions are quite open-ended so that they can cover what you know. Expect things like: what happens when you type facebook.com into the browser and press Enter? (I talked non-stop for roughly 30 minutes on this question.) You should at least cover the basics, which are DNS, TCP, and HTTP. But you can talk about DHCP, SLAAC, IPv4, IPv6, BGP, OSPF, iBGP, NAT, QUIC, UDP, ICMP, and on and on.
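The three basics can be walked through in a few lines of code. The sketch below is a simplification, not what a browser actually does: the function name `fetch_status_line` is hypothetical, and it speaks plain HTTP on port 80 with no TLS, caching, or redirects, whereas a real visit to facebook.com would use HTTPS. It is only meant to show where DNS, TCP, and HTTP each come in.

```python
import socket


def fetch_status_line(host, port=80, path="/", timeout=5.0):
    """Resolve a host, open a TCP connection, send one HTTP GET,
    and return the first line of the response."""
    # Step 1, DNS: turn the name into one or more socket addresses.
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        # Step 2, TCP: connect() performs the three-way handshake.
        sock.connect(sockaddr)
        # Step 3, HTTP: a minimal HTTP/1.1 request over the TCP stream.
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        response = b""
        # Read until the status line is complete or the peer closes.
        while b"\r\n" not in response:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("iso-8859-1")
```

Calling `fetch_status_line("example.com")` on a machine with internet access should return a line such as an HTTP status, and each commented step is a natural place to branch out into the deeper topics listed above.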

Also, you should know at least one network protocol in depth (e.g., TCP or HTTP) and a bit about troubleshooting common network problems and the relevant tools (ping, mtr, traceroute, arp, ip, route, netstat).

resources:

I didn’t spend much time on behavioral preparation, but I can give you some tips. Facebook and Google behavioral interviews are completely different things.

At Facebook, the interviewer goes through your past experiences and asks basic questions such as why Facebook, why production engineering, and stuff like that. It helps to go through the example questions your recruiter provides and think of relevant past experiences beforehand. The interview is meant to make sure you are a decent, functioning person.

At Google, the interviewer asks mainly hypothetical questions, which are quite abstract and ambiguous. You are expected to ask clarifying questions to come up with a reasonable answer. My suggestion is to brush up on agile (focus on the concept, not the names, such as breaking up a project into small deliverable sub-tasks) and don’t get hung up on the specifics. Also, get familiar with OKRs and take a look at rework. You can find most of the questions asked online as well.

The system design interview described here is just for Facebook. Google does a non-abstract version, which is different from a typical system design interview.

The system design for the Production engineering role is focused on infra tools similar to Kubernetes, Jenkins, BitTorrent, etc., so my suggestion is to study how these systems are implemented to draw inspiration for your own design.

The most important things in the interview are:

  • Ask clarifying questions (how many queries? what is the data size? how many deploys? etc.)
  • Think about edge cases and what happens when things fail: concurrency issues, power outages …
  • Make sure your system doesn’t have a single point of failure and that it’s scalable.

There are many examples of system design interviews available; however, they are mainly for SWE positions and don’t focus on infra tools. The nature of the interviews is the same, so they are still somewhat useful.

You should know some basics: etcd/ZooKeeper, distributed locking, concurrency issues, isolation levels, queues, S3. For practice, you can design a job scheduler.

resources:

  • Grokking the system design interview by Educative
  • Designing Data-intensive applications by Martin Kleppmann
  • Search for system design on YouTube; there are lots of valuable videos.

The NALSD interview is unique to Google, and there are limited resources available for it. The design is pretty low-level, and you are expected to come up with a bill of materials.

Same as any other system design interview, you are given a vague question. You should ask clarifying questions and come up with numbers regarding storage/network/IOPS, then come up with an estimate as to how many machines are required and the main bottleneck of the system.
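The estimation step can be sketched as simple arithmetic. The code below is a minimal sketch with made-up numbers: the `Workload` and `Machine` classes, the function `machines_needed`, and every figure in the usage example are assumptions for illustration, not Google's method or real hardware specs. The idea is to compute how many machines each resource would need on its own and treat the largest as the bottleneck.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    storage_tb: float           # total data to store
    peak_bandwidth_gbps: float  # peak aggregate network traffic
    peak_iops: float            # peak disk operations per second


@dataclass
class Machine:
    storage_tb: float  # usable disk per machine
    nic_gbps: float    # network capacity per machine
    iops: float        # disk operations per second per machine


def machines_needed(workload, machine):
    """Return (machine_count, bottleneck, per_resource_counts)."""
    per_resource = {
        "storage": math.ceil(workload.storage_tb / machine.storage_tb),
        "network": math.ceil(workload.peak_bandwidth_gbps / machine.nic_gbps),
        "iops": math.ceil(workload.peak_iops / machine.iops),
    }
    # The resource that demands the most machines drives the machine count.
    bottleneck = max(per_resource, key=per_resource.get)
    return per_resource[bottleneck], bottleneck, per_resource


# Hypothetical numbers: 500 TB, 40 Gbps, 2M IOPS on 20 TB / 10 Gbps / 50k IOPS boxes.
count, bottleneck, detail = machines_needed(
    Workload(storage_tb=500, peak_bandwidth_gbps=40, peak_iops=2_000_000),
    Machine(storage_tb=20, nic_gbps=10, iops=50_000),
)
print(count, bottleneck, detail)  # 40 iops {'storage': 25, 'network': 4, 'iops': 40}
```

This ignores replication, headroom, and failure domains, which would each multiply the count; in the interview those are good things to state out loud as you refine the estimate.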

To be truly successful, you should know a simple sharding strategy and how the assignment is done, SSTables and memtables, write-ahead logging, and SLOs (what kinds of SLIs are appropriate for each type of system).
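As one example of a simple sharding strategy, the sketch below assigns keys to shards by hashing the key and taking the result modulo the number of shards. This is only one possible approach, and the function name `shard_for` is hypothetical. A stable hash from hashlib is used because Python's built-in `hash()` for strings is randomized per process, which would give different assignments on different machines.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a string key to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # SHA-256 gives the same digest on every machine and every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo N.
    return int.from_bytes(digest[:8], "big") % num_shards


# Spread 10,000 hypothetical user IDs over 8 shards and count per shard.
counts = [0] * 8
for user_id in range(10_000):
    counts[shard_for(f"user-{user_id}", 8)] += 1
print(counts)
```

The main weakness to be ready to discuss is resharding: changing the number of shards remaps most keys, which is why schemes such as consistent hashing or range-based assignment with a lookup table come up as alternatives.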

resources:

https://sre.google/workbook/non-abstract-design/

https://www.educative.io/courses/grokking-adv-system-design-intvw
https://sre.google/classroom/

https://www.youtube.com/watch?v=swfurPw8c6A

The troubleshooting interviews at Google and Facebook are completely different. Facebook focuses on practical open-ended issues that you probably encounter during your day-to-day tasks, such as latency problems (e.g., a database is running slow). Google comes up with weird, specific scenarios that literally never happen (the interviewer himself agreed). So I will break it up into two parts.

Facebook:

The interview is like Dungeons and Dragons: you query the interviewer, and he provides you with an answer (for example, you say you are going to run top, and he tells you that you see high load but low CPU utilization). He is very interested in your thought process and how you go about troubleshooting, and less in solving the problem itself.
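The high load but low CPU utilization example has a classic explanation worth knowing: on Linux, the load average counts tasks in uninterruptible sleep (state D, usually waiting on I/O) as well as runnable ones. The sketch below is a rough way to check for that by reading process states from /proc. The function names are hypothetical, and it only looks at one entry per process rather than per thread, so treat it as an approximation.

```python
import os
from collections import Counter


def parse_state(stat_text):
    """Extract the state letter from the contents of /proc/<pid>/stat."""
    # Format: "pid (comm) state ...". comm may contain spaces or ")",
    # so split on the LAST ")" before taking the next field.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_task_states(proc_root="/proc"):
    """Count processes per state letter (R, S, D, Z, ...)."""
    states = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directories are PIDs
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                states[parse_state(handle.read())] += 1
        except OSError:
            continue  # the process exited while we were scanning
    return states


if __name__ == "__main__":
    print(count_task_states())
```

Many processes in state D alongside idle CPUs point toward storage or network filesystems, which is the next place to query the interviewer about (for example with iostat).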

The most useful resource for tackling real production problems is Brendan Gregg’s work. So watch his YouTube videos and read his book and his blog. Also, go through the topics provided by your recruiter. Try to come up with some problems yourself and try to debug those. For example, you could have a latency problem caused by a noisy neighbor that resides in the same rack and takes the whole bandwidth.

Google:

The format of the interview is still querying the interviewer and getting a response. Still, most of the time, they will provide you with a command output instead of telling you the most important piece of information.

To be honest, I still don’t know how you can prepare for this interview. My suggestion is to talk to a Google SRE about this type of interview to find out more beforehand. While Facebook’s questions were open-ended, Google asked multiple specific questions that you either could or couldn’t answer, without much room to explore.

Also, Brendan's work was basically useless for this interview as none of the questions covered practical troubleshooting scenarios. For example, they could give you a question where the system doesn’t boot because some specific file isn’t in the right format!

Final Notes

I believe it’s crucial to get a feel for what you should expect during the interview and what kinds of signals you need to give. So I strongly recommend scheduling mock interviews with peers or, much more preferably, actual Google or Facebook employees.

The interview prep took me roughly 3 months of putting in 30–40 hours a week. I decided to join Facebook as a Production Engineer at level 4.

Overall resources:

https://leetcode.com/discuss/interview-experience/707265/Facebook-Apple-Amazon-or-Production-Engineer-EE-SRE-SysDE-or-London-or-May-2020-Offer

https://interviewthoughts.quora.com/My-Site-Reliability-Engineer-Interview-with-Google-Dublin

https://fabrizio2210.medium.com/how-i-get-a-job-at-google-as-sre-83d44aef7859

https://pramp.com