In 1992, Steven McCanne and Van Jacobson wrote a paper called “The BSD Packet Filter: A New Architecture for User-level Packet Capture”. This was the first description of BPF, the Berkeley Packet Filter.
In this paper, they described the BPF architecture, how it connects to the rest of the system, and a new way to filter packets. If you want to read the paper, you can find it here: https://www.tcpdump.org/papers/bpf-usenix93.pdf
The paper introduced a new virtual machine designed to work well on register-based CPUs. It also introduced per-application buffers, so that packets can be filtered without copying all of the packet data.
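To make the idea of a register-based filter machine more concrete, here is a minimal sketch in C. It is a toy model, not the instruction set from the paper or from the kernel: the opcode names, `filt_insn`, `filt_run`, and `ipv4_tcp_filter` are all made up for illustration, and the byte offsets assume a plain Ethernet frame carrying IPv4 with no VLAN tag. The filter reads the packet in place through a single accumulator register and returns how many bytes to keep (0 means drop), so nothing has to be copied for a rejected packet.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy opcodes (hypothetical names, only loosely inspired by classic BPF). */
enum filt_op {
    FILT_LDH_ABS, /* A = 16-bit big-endian value at packet offset k */
    FILT_LDB_ABS, /* A = byte at packet offset k */
    FILT_JEQ,     /* if A == k skip jt instructions, otherwise skip jf */
    FILT_RET      /* stop and return k (bytes to accept, 0 = drop) */
};

struct filt_insn {
    enum filt_op op;
    uint8_t jt;   /* forward skip when the comparison is true */
    uint8_t jf;   /* forward skip when the comparison is false */
    uint32_t k;   /* constant operand: offset, value, or return code */
};

/* Run a filter against the packet buffer in place; the packet is never copied. */
uint32_t filt_run(const struct filt_insn *prog, size_t len,
                  const uint8_t *pkt, size_t pkt_len)
{
    uint32_t a = 0; /* the accumulator register */

    for (size_t pc = 0; pc < len; pc++) {
        const struct filt_insn *in = &prog[pc];

        switch (in->op) {
        case FILT_LDH_ABS:
            /* Reading past the end of the packet drops it. */
            if (pkt_len < 2 || in->k > pkt_len - 2)
                return 0;
            a = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case FILT_LDB_ABS:
            if (in->k >= pkt_len)
                return 0;
            a = pkt[in->k];
            break;
        case FILT_JEQ:
            pc += (a == in->k) ? in->jt : in->jf;
            break;
        case FILT_RET:
            return in->k;
        }
    }
    return 0; /* ran off the end of the program: drop */
}

/* Example filter: accept only IPv4 TCP packets in an Ethernet frame. */
const struct filt_insn ipv4_tcp_filter[] = {
    { FILT_LDH_ABS, 0, 0, 12 },     /* A = EtherType */
    { FILT_JEQ,     0, 3, 0x0800 }, /* not IPv4: skip ahead to the drop */
    { FILT_LDB_ABS, 0, 0, 23 },     /* A = IP protocol field */
    { FILT_JEQ,     0, 1, 6 },      /* not TCP: skip ahead to the drop */
    { FILT_RET,     0, 0, 65535 },  /* accept up to 65535 bytes */
    { FILT_RET,     0, 0, 0 },      /* drop */
};
```

Calling `filt_run(ipv4_tcp_filter, 6, pkt, pkt_len)` on a frame whose EtherType is 0x0800 and whose protocol byte is 6 returns 65535; any other frame, or one too short to hold those fields, returns 0.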
It’s too hard for me to understand what the paper is about, so I’m not going to read it all the way through. This article will talk about how BPF has grown and how you can write BPF programs.
In 2014, Alexei Starovoitov introduced the extended BPF (eBPF) implementation. BPF is an advanced VM that runs in an isolated environment. It runs the code that you write as a BPF program. You can think of it as similar to the JVM.
A BPF program is code that you write to be loaded into the kernel. You can write it in C, and compilers that support BPF can convert it into BPF instructions.
The verifier is, in my opinion, the most important part of extended BPF. To make sure you don’t crash your kernel or get it stuck in an infinite loop, it checks your program before it runs: it rejects loops that might never terminate and makes sure every code path reaches the end. This keeps the kernel safe and lets you write BPF programs without having to think too much about kernel internals.
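The two checks described above (no loops, and every path reaches the end) can be sketched in a few lines of C. This is a toy model under stated assumptions, not the kernel’s verifier: `toy_insn`, the three opcodes, and `toy_verify` are hypothetical names invented for this example, and the real eBPF verifier is far more involved (it also tracks register contents and memory accesses, among other things). The sketch relies on a simple observation: if every jump goes strictly forward and stays inside the program, and the last instruction is a return, then execution can never loop and can never run off the end.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction set (hypothetical, for illustration only). */
enum toy_op { TOY_LOAD, TOY_JEQ, TOY_RET };

struct toy_insn {
    enum toy_op op;
    int16_t jt;   /* relative jump taken when the comparison is true */
    int16_t jf;   /* relative jump taken when the comparison is false */
    uint32_t k;   /* constant operand */
};

/*
 * Return true only if the program is guaranteed to terminate at a
 * TOY_RET instruction:
 *  - the program is not empty and its last instruction is a return;
 *  - no jump goes backward (a backward jump could form a loop);
 *  - every jump target lies inside the program.
 */
bool toy_verify(const struct toy_insn *prog, size_t len)
{
    if (len == 0 || prog[len - 1].op != TOY_RET)
        return false;

    for (size_t pc = 0; pc < len; pc++) {
        if (prog[pc].op != TOY_JEQ)
            continue;

        const int16_t offs[2] = { prog[pc].jt, prog[pc].jf };
        for (int i = 0; i < 2; i++) {
            if (offs[i] < 0)
                return false; /* backward jump: possible loop */
            if (pc + 1 + (size_t)offs[i] >= len)
                return false; /* jump past the end of the program */
        }
    }
    return true;
}
```

A program would only be handed to an interpreter or JIT after `toy_verify` returns true, which mirrors the order described here: check first, then run.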
There is also a “just in time” (JIT) compiler in the kernel that translates the BPF bytecode into machine code after the program has been checked.
Loading your program into the kernel does not require a restart. It is done while your system is running.
Berkeley Packet Filter (BPF) is a powerful technology that allows you to run sandboxed programs in the Linux kernel. It can be used for tracing, monitoring, networking, and security applications. As BPF’s capabilities grow, knowledge of it is becoming an increasingly valuable skill for Linux jobs. In this article, we provide an overview of BPF and list some of the top BPF interview questions you may encounter when applying for roles requiring Linux, networking, performance analysis, and security expertise.
What is Berkeley Packet Filter?
Berkeley Packet Filter (BPF) originated in the early 1990s as a simple packet filtering mechanism for BSD variants of UNIX. It allowed users to write small programs to inspect network packets and decide what to do with them, such as filtering, forwarding, or monitoring them. BPF programs could be loaded into the kernel and run in a sandboxed environment for efficiency and security.
The Linux kernel adopted a version of BPF in the late 1990s. Since then, BPF has evolved into a generalized in-kernel virtual machine that allows safe execution of user-defined programs for purposes like:
- Packet processing
- Tracing
- Monitoring
- Security policies
Some key capabilities provided by today’s BPF include:
- Sandboxed execution environment
- Verification of BPF programs for safety before loading into kernel
- Access to data sources like network packets, performance counters, kernel and userspace functions
- Efficient maps for sharing data between the kernel and user space
- Helper functions for parsing packets, tracing events, etc.
- Just-in-time compilation of BPF bytecode to native machine code
- Ability to tailor BPF programs for specific tasks by choosing data sources, helpers, and maps
This combination of speed, flexibility, and safety has expanded BPF’s uses far beyond its origins in packet filtering.
Why is BPF knowledge important?
Understanding BPF is becoming an increasingly valuable skill for Linux professionals for several reasons:
- Growth of software-defined networking, containers, telemetry, and service meshes requiring efficient packet processing. BPF provides a programmable data path in the kernel for these use cases.
- Need for improved observability into Linux systems. BPF enhances capabilities for dynamic tracing, monitoring, and performance analysis.
- Desire for more flexible security policies in the kernel. BPF allows implementing security models in customized programs.
- Performance optimization of network functions and applications. BPF reduces overheads through techniques like zero-copy packet access.
- Popularity of BPF frontends like bpftrace and BCC, and of BPF-based frameworks like XDP, which make BPF capabilities more accessible.
Major companies like Facebook, Netflix, Google, Microsoft, and Cloudflare now rely on BPF for monitoring, networking, and security use cases. Fluency with BPF can make your resume stand out for Linux roles and provide insight into how systems work under the hood.
Sample BPF Interview Questions
Here are some common interview questions you may encounter related to BPF:
BPF Fundamentals
- What is BPF and what are its key capabilities?
- How is BPF different from the original Berkeley Packet Filter?
- What are some example uses of BPF today beyond packet processing?
- How does BPF provide safety when running kernel programs written by users?
BPF Architecture
- What components of the Linux kernel are involved in BPF processing?
- What is the BPF virtual machine and how does it work?
- Explain the BPF program lifecycle from writing code to execution.
- What is a BPF map and why is it useful?
- How is BPF bytecode represented and processed in the kernel?
BPF Programming
- What programming languages can be used for writing BPF programs?
- What interfaces exist for loading and interacting with BPF programs?
- How can BPF programs access data from the kernel or hardware?
- What are some key data structures like registers and maps used in BPF code?
- How can BPF programs be traced and debugged?
BPF Networking
- What networking capabilities does BPF provide?
- How can BPF programs process packets more efficiently?
- What is XDP and how does BPF improve its performance?
- How is BPF used in software-defined networking and service meshes?
- What are some examples of using BPF for load balancing, firewalls, etc.?
BPF Observability
- How does BPF integrate with Linux tracing frameworks like kprobes?
- What types of data can BPF programs access for observability?
- How can BPF improve monitoring and performance analysis?
- What are bpftrace, BCC, and other BPF frontend tools?
- Give examples of using BPF for monitoring disk I/O, profiling, etc.
BPF Security
- What additional security does BPF provide beyond iptables for networking?
- How can BPF programs implement security policies or mitigations?
- What are some example security use cases leveraging BPF?
- How does BPF relate to other Linux security mechanisms like seccomp?
Having fundamental knowledge of how BPF works along with hands-on experience developing programs can help you excel at BPF interview questions. Study resources like blogs, talks, the BPF reference guide, and source code of tools built on BPF. Experiment with frontend tools like bpftrace and BCC. This will equip you to understand how BPF is applied in real-world scenarios.
At the end of the day, interviewers want to know that you have solid engineering skills and the ability to continue learning new technologies like BPF. Show your enthusiasm for Linux, networking, and monitoring – and be ready to dive into technical details or code samples during the interview. A passion for open source combined with knowledge of BPF will prepare you for success on Linux jobs.
Components of BPF Code
What does your program actually contain? A BPF program mainly has three components. The first is the execution point in the kernel. These execution points are predefined, and you can attach your program to any of them. For example, you can choose a particular system call as the execution point. In that case, your BPF program runs whenever that system call is executed.
Second is how you share data between the kernel and user space. This is done with BPF maps, which let you share data in both directions. Whenever you create a BPF program, you can create a BPF map for data sharing.
The third is the program logic itself, that is, what your program actually does. Most of the time, your use cases will fall into the performance or troubleshooting categories.
In short, BPF lets you run your own code at many predefined points in the kernel. That code can be used to check how well the system is running, filter network packets, and do many other things.
I’ll try to write about how to write and run a BPF program in the next few posts. I’m also new to this area, so I’m trying to learn more and will keep you posted.
FAQ
What is Berkeley Packet filtering?
Berkeley Packet Filters (BPF) provide a powerful tool for intrusion detection analysis. Use BPF filtering to quickly reduce large packet captures to a reduced set of results by filtering based on a specific type of traffic. Both admin and non-admin users can create BPF filters.
How to develop a packet filter in Linux?
I’ve just read in these answers about two options for developing packet filters in linux. The first is using iptables and netfilter, probably with NFQUEUE and libnetfilter_queue library. The second is by using BPF (Berkeley Packet Filter), that seems in a quick reading to have similar capabilities for filtering purposes.
What is extended Berkeley Packet Filter (eBPF)?
The result is extended Berkeley Packet Filter (eBPF), which consists of a richer assembly, more program types, maps to store key/value pairs, and more components. Currently eBPF (or just BPF) is under continuous development and its capabilities are evolving, although the main uses are networking and tracing.