Backend Engineering Interview Questions for 2026

Backend engineering interviews test your ability to design, build, and scale server-side systems. Beyond writing code, interviewers want to know that you can make architectural decisions about APIs, choose between microservices and monoliths, implement secure authentication, design effective caching strategies, and use message queues for reliability. This guide covers 20 essential questions that are frequently asked at top tech companies, each with a detailed answer.


REST vs GraphQL Questions

What are the key differences between REST and GraphQL?

REST uses multiple endpoints with fixed data shapes (GET /users, GET /users/1/posts). GraphQL uses a single endpoint where clients specify exactly what data they need in a query. REST can over-fetch or under-fetch data; GraphQL solves this but adds complexity with query parsing, schema management, and N+1 query problems. REST is simpler for CRUD; GraphQL excels with complex, nested data and multiple client types.

How do you handle versioning in a REST API?

Common strategies: URL path versioning (/v1/users, /v2/users) is explicit and easy to understand. Header versioning (Accept: application/vnd.api.v2+json) keeps URLs clean. Query parameter versioning (?version=2) is simple but less RESTful. The best approach depends on your consumers: public APIs benefit from URL versioning for clarity; internal APIs may use header versioning for flexibility.
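
The three strategies above can be sketched as a single resolver. This is a minimal illustration, not a framework API: the function name `resolve_version`, the default of version 1, and the precedence order (path, then header, then query parameter) are all assumptions made for the example.

```python
import re
from typing import Dict, Optional

DEFAULT_VERSION = 1  # assumed fallback when the client names no version


def resolve_version(path: str,
                    accept: Optional[str] = None,
                    query: Optional[Dict[str, str]] = None) -> int:
    """Return the requested API version: path first, then header, then query."""
    # URL path versioning: /v2/users
    match = re.match(r"^/v(\d+)(/|$)", path)
    if match:
        return int(match.group(1))
    # Header versioning: Accept: application/vnd.api.v2+json
    if accept:
        match = re.search(r"application/vnd\.api\.v(\d+)\+json", accept)
        if match:
            return int(match.group(1))
    # Query parameter versioning: ?version=2
    if query and "version" in query:
        return int(query["version"])
    return DEFAULT_VERSION
```

In practice an API usually commits to one of these strategies rather than accepting all three; the combined resolver is only meant to show how each one is read from the request.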

What is the N+1 query problem in GraphQL and how do you solve it?

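The N+1 problem occurs when a query fetches a list of items (1 query) and then makes individual queries for each item's related data (N queries). In GraphQL, this happens because each field resolver executes independently and does not know what its sibling resolvers are fetching. Solutions include DataLoader (batches and caches database calls within a single request), query lookahead (inspecting the requested fields up front so related data can be loaded with a SQL join), and persisted queries, which restrict clients to a known set of queries that can be optimized ahead of time.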
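
The batching idea behind DataLoader can be sketched in a few lines. This is a simplified, synchronous stand-in and not the DataLoader library's API: real implementations defer the batch with promises or an event loop tick. The names `BatchLoader`, `fetch_posts_for_users`, and the in-memory `POSTS_BY_USER` table are hypothetical.

```python
from typing import Callable, Dict, Hashable, List


class BatchLoader:
    """Collects keys, then resolves all of them with one batched call."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Dict[Hashable, object]]):
        self._batch_fn = batch_fn
        self._pending: List[Hashable] = []
        self._cache: Dict[Hashable, object] = {}  # per-request cache

    def load(self, key: Hashable) -> Callable[[], object]:
        """Queue a key and return a thunk that yields its value later."""
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)
        return lambda: self._resolve(key)

    def _resolve(self, key: Hashable) -> object:
        if self._pending:
            # One batched fetch for every key queued so far.
            self._cache.update(self._batch_fn(self._pending))
            self._pending = []
        return self._cache.get(key)


# Hypothetical data source that records each batched "query" it receives.
POSTS_BY_USER = {1: ["a", "b"], 2: ["c"], 3: []}
batches: List[List[int]] = []


def fetch_posts_for_users(user_ids):
    batches.append(list(user_ids))
    return {uid: POSTS_BY_USER.get(uid, []) for uid in user_ids}


loader = BatchLoader(fetch_posts_for_users)
thunks = [loader.load(uid) for uid in (1, 2, 3, 1)]  # resolvers queue their keys
results = [thunk() for thunk in thunks]              # first thunk triggers one batch
```

Four resolver calls end up as a single call to the data source, and the repeated key is served from the per-request cache.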

Microservices vs Monoliths Questions

When should you choose microservices over a monolith?

Start with a monolith and extract microservices when you have clear domain boundaries, teams that need to deploy independently, services with different scaling requirements, or components needing different technology stacks. Microservices add significant operational complexity (networking, monitoring, distributed tracing, data consistency). Most startups should start monolithic and decompose after reaching product-market fit.

How do microservices communicate with each other?

Synchronous: REST or gRPC for request-response patterns. gRPC is usually faster (compact Protocol Buffers encoding, HTTP/2 multiplexing, streaming) and is a common choice for internal communication. Asynchronous: message queues (Kafka, RabbitMQ) for event-driven communication, decoupling services and absorbing traffic spikes. Choose synchronous when you need an immediate response; asynchronous when the caller does not need to wait or when building event-driven architectures.

What is the saga pattern and when do you need it?

The saga pattern manages distributed transactions across microservices, where a single ACID transaction spanning several services' databases is impractical. Each service executes its local transaction and publishes an event. If any step fails, compensating transactions undo the previously completed steps. Two types: choreography (services react to each other's events, simpler to start with but harder to track) and orchestration (a central coordinator manages the flow, easier to understand and debug).
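
An orchestrated saga can be sketched as a coordinator that runs steps in order and, on failure, runs the compensations of completed steps in reverse. This is a minimal in-process sketch: the step names and the `run_saga` helper are hypothetical, and a real orchestrator would also persist its progress so it can resume after a crash.

```python
from typing import Callable, List, Tuple

# (name, action, compensation)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"done:{name}")
            completed.append(step)
        except Exception:
            log.append(f"failed:{name}")
            # Undo everything that already succeeded, newest first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            break
    return log


def decline_payment() -> None:
    raise RuntimeError("payment declined")


saga_log = run_saga([
    ("reserve_inventory", lambda: None, lambda: None),
    ("charge_payment", decline_payment, lambda: None),
    ("create_shipment", lambda: None, lambda: None),
])
```

Here the payment step fails, so the inventory reservation is compensated and the shipment step never runs.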

What is an API gateway and what problems does it solve?

An API gateway is a single entry point for client requests that routes to appropriate microservices. It handles cross-cutting concerns: authentication, rate limiting, request/response transformation, SSL termination, load balancing, circuit breaking, and API composition. Examples include Kong, AWS API Gateway, and NGINX. Without a gateway, clients must know about every service and handle these concerns individually.

Authentication Questions

Explain the difference between authentication and authorization.

Authentication verifies who you are (identity). Authorization determines what you can do (permissions). Authentication happens first: login with credentials, receive a token. Authorization happens on each request: check if the authenticated user has permission for the requested action. Common models: RBAC (role-based), ABAC (attribute-based), and ACL (access control lists).
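
A role-based (RBAC) authorization check can be as small as a lookup from roles to permissions. The role names and permission strings below are hypothetical, chosen only to illustrate the per-request check that happens after authentication has established who the caller is.

```python
from typing import Dict, Iterable, Set

# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "admin": {"orders:read", "orders:write", "users:manage"},
    "support": {"orders:read"},
}


def is_authorized(roles: Iterable[str], permission: str) -> bool:
    """Return True if any of the caller's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```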

How does OAuth 2.0 work and what are the common grant types?

OAuth 2.0 delegates authorization without sharing the user's credentials with the client. The Authorization Code grant (with PKCE) is the standard for web and mobile apps: the client redirects the user to the authorization server, the user authenticates and consents, the authorization server redirects back with an authorization code, and the client exchanges that code for tokens. Client Credentials grant is for machine-to-machine. Implicit grant is deprecated. Always use PKCE with public clients to prevent authorization code interception attacks.
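
The PKCE part of the flow comes down to a random verifier and its SHA-256 challenge (the S256 method from RFC 7636). The sketch below uses only the Python standard library; the function names are chosen for illustration, and `verify_challenge` shows the check an authorization server would perform at the token exchange.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Random verifier kept by the client (32 bytes gives 43 URL-safe chars)."""
    return secrets.token_urlsafe(32)


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_challenge(verifier: str, challenge: str) -> bool:
    """Server-side check when the client exchanges the code for tokens."""
    return hmac.compare_digest(make_code_challenge(verifier), challenge)


verifier = make_code_verifier()            # stays on the client
challenge = make_code_challenge(verifier)  # sent with the authorization request
```

An attacker who intercepts the authorization code still cannot redeem it, because the token request must present the original verifier.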

What is a JWT, how is it structured, and what are the security considerations?

A JWT (JSON Web Token) has three Base64url-encoded parts separated by dots: header (algorithm, token type), payload (claims like user ID, expiration, roles), and signature (verifies integrity). The payload is encoded, not encrypted, so it should not carry secrets. JWTs are stateless (no server-side storage), but a purely stateless JWT cannot be revoked before it expires unless the server keeps extra state such as a denylist. Security considerations: use short expiration times, store in httpOnly cookies (not localStorage), validate all claims, use RS256 over HS256 for distributed systems so that verifying services hold only the public key, and implement refresh token rotation.
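
The three-part structure is easier to see when a token is built by hand. The sketch below signs and verifies an HS256 token with only the Python standard library, purely for illustration; production code would normally use a maintained JWT library. The helper names and the choice to treat a missing `exp` claim as expired are assumptions of this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Check algorithm, signature, and expiry; raise ValueError on any failure."""
    header_b64, payload_b64, sig_b64 = token.split(".")  # ValueError if not 3 parts
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":  # never trust the token's own alg choice
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:  # assumption: a missing exp counts as expired
        raise ValueError("token expired")
    return claims


token = sign_jwt({"sub": "user-1", "exp": 2000}, b"demo-secret")
```

Note that anyone holding the token can decode the header and payload; only the signature check requires the secret.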

Compare session-based authentication with token-based authentication.

Session-based: server stores session data, sends a session ID cookie. Easy to revoke (delete from store) but requires sticky sessions or shared storage (Redis) across servers. Token-based (JWT): stateless, no server storage, works across domains and services. Cannot be easily revoked and increases payload size. Use sessions for traditional web apps; tokens for SPAs, mobile apps, and microservices.

Caching Questions

What are the common caching strategies and when do you use each?

Cache-aside (lazy loading): application checks cache first, loads from DB on miss, writes to cache. Write-through: writes to cache and DB simultaneously (strong consistency, slower writes). Write-behind (write-back): writes to cache immediately, asynchronously syncs to DB (fast writes, risk of data loss). Read-through: cache loads from DB on miss (simpler application code). Choose based on your consistency requirements and read/write ratio.
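
Cache-aside is the strategy most often asked about, so here is a minimal sketch of it. An in-memory dict stands in for Redis, and the class name `CacheAside`, the `load_user` function, and the 60-second TTL are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Check the cache first; on a miss, load from the DB and populate the cache."""

    def __init__(self, load_from_db: Callable[[str], Any],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                      # cache hit
        value = self._load(key)                  # cache miss: read from the DB
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Call after writing to the DB so the next read reloads fresh data."""
        self._entries.pop(key, None)


# Hypothetical database and a counter of how often it is read.
DB = {"user:1": "Ada"}
db_reads = {"count": 0}


def load_user(key: str) -> str:
    db_reads["count"] += 1
    return DB[key]


cache = CacheAside(load_user)
first = cache.get("user:1")   # miss, reads the DB
second = cache.get("user:1")  # hit, served from the cache
```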

What is cache invalidation and why is it considered hard?

Cache invalidation removes or updates stale cached data when the source data changes. It is hard because: you must track all cached copies across multiple layers (application, CDN, browser), race conditions can cause stale data, and invalidation at scale introduces latency. Strategies include TTL-based expiration, event-driven invalidation, and versioned keys.
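
Versioned keys are the least obvious of these strategies, so a short sketch may help. Instead of deleting entries, the cache bumps a version number that is part of every key, which makes all old entries unreachable at once. The class and method names are hypothetical, and a dict again stands in for a real cache that would reclaim the orphaned entries through TTL or eviction.

```python
from typing import Any, Dict, Optional


class VersionedCache:
    """Invalidate a whole namespace by bumping the version embedded in its keys."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self._versions: Dict[str, int] = {}

    def _key(self, namespace: str, key: str) -> str:
        return f"{namespace}:v{self._versions.get(namespace, 0)}:{key}"

    def set(self, namespace: str, key: str, value: Any) -> None:
        self._store[self._key(namespace, key)] = value

    def get(self, namespace: str, key: str) -> Optional[Any]:
        return self._store.get(self._key(namespace, key))

    def invalidate_namespace(self, namespace: str) -> None:
        # Old entries stay in the store but can no longer be addressed.
        self._versions[namespace] = self._versions.get(namespace, 0) + 1
```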

How does a CDN work and when should you use one?

A CDN (Content Delivery Network) caches content at edge servers geographically close to users, reducing latency and origin server load. CDNs handle static assets (images, CSS, JS), but modern CDNs also cache API responses and run edge compute (Cloudflare Workers, Lambda@Edge). Use a CDN when: you serve users globally, have high traffic, or need DDoS protection.

Message Queues Questions

What is a message queue and when should you use one?

A message queue decouples producers (senders) from consumers (receivers), enabling asynchronous processing. Use message queues for: background job processing (email sending, image resizing), smoothing traffic spikes, event-driven architectures, and cross-service communication in microservices. Message queues improve resilience and scalability.
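
The producer/consumer decoupling can be demonstrated in-process with Python's standard `queue` module. This is only a stand-in for a real broker such as RabbitMQ or Kafka: the email job, the `None` shutdown sentinel, and the `run_worker` function are assumptions of the sketch.

```python
import queue
import threading
from typing import List


def run_worker(jobs: "queue.Queue", results: List[str]) -> None:
    """Consumer: process jobs until the shutdown sentinel (None) arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            results.append(f"sent email to {job}")  # placeholder for real work
        finally:
            jobs.task_done()


jobs: "queue.Queue" = queue.Queue()
sent: List[str] = []
worker = threading.Thread(target=run_worker, args=(jobs, sent), daemon=True)
worker.start()

# Producer: enqueue work and return immediately instead of doing it inline.
for address in ("a@example.com", "b@example.com"):
    jobs.put(address)
jobs.put(None)  # ask the worker to stop

jobs.join()     # wait until every queued item has been processed
worker.join()
```

The producer never calls the email logic directly, so a slow or temporarily unavailable consumer delays the work without blocking the caller.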

Compare Kafka and RabbitMQ. When would you choose each?

Kafka is a distributed log: messages are persisted, ordered within partitions, and can be replayed. It excels at high throughput, event streaming, and event sourcing. RabbitMQ is a traditional message broker: supports complex routing (exchanges, bindings), message acknowledgment, and priority queues. Choose Kafka for event streaming, log aggregation, and high-throughput scenarios. Choose RabbitMQ for task queues and complex routing.

How do you ensure exactly-once message processing?

True exactly-once is extremely difficult in distributed systems. Practical approaches: at-least-once delivery with idempotent consumers (use idempotency keys to detect and skip duplicate processing), Kafka transactions for exactly-once within Kafka, and the outbox pattern (write events to an outbox table in the same DB transaction, then publish asynchronously). Design consumers to be idempotent: processing the same message twice should produce the same result.
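
An idempotent consumer can be sketched as a handler wrapped with a record of processed idempotency keys. The message shape, the `idempotency_key` field name, and the in-memory set are assumptions; in a real system the key would be stored in the same database transaction as the side effect, so the two cannot drift apart.

```python
from typing import Callable, Dict, Set


class IdempotentConsumer:
    """Skip messages whose idempotency key has already been processed."""

    def __init__(self, handler: Callable[[dict], None]):
        self._handler = handler
        self._processed: Set[str] = set()  # stand-in for a durable table

    def handle(self, message: dict) -> bool:
        key = message["idempotency_key"]
        if key in self._processed:
            return False          # duplicate delivery: do nothing
        self._handler(message)
        self._processed.add(key)  # record only after the handler succeeded
        return True


# Hypothetical side effect: crediting an account balance.
balance: Dict[str, int] = {"amount": 0}


def credit(message: dict) -> None:
    balance["amount"] += message["amount"]


consumer = IdempotentConsumer(credit)
payment = {"idempotency_key": "pay-123", "amount": 50}
first_delivery = consumer.handle(payment)
redelivery = consumer.handle(payment)  # broker retried the same message
```

With at-least-once delivery the broker may redeliver the message, but the balance is credited only once.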

API Design Questions

What are the best practices for designing a REST API?

Use nouns for resources (/users, /orders), HTTP methods for actions (GET, POST, PUT, PATCH, DELETE). Return appropriate status codes (201 Created, 404 Not Found, 422 Unprocessable Entity). Support pagination, filtering, and sorting. Use consistent error response format. Version your API. Implement rate limiting and return rate limit headers. Document with OpenAPI/Swagger. Design for idempotency.

How do you handle pagination in APIs with large datasets?

Offset-based (?page=3&limit=20): simple but slow for deep pages (database must skip rows) and inconsistent with concurrent inserts/deletes. Cursor-based (?cursor=abc123&limit=20): uses an opaque cursor (encoded ID or timestamp) for the next page. Consistent and performant regardless of page depth. Use cursor-based for feeds and large datasets; offset-based only for small, static datasets.
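
Cursor-based pagination can be sketched over an in-memory list sorted by ID. The `ITEMS` data, the cursor format (base64-encoded JSON holding the last seen ID), and the function names are assumptions for illustration; against a database the filter would be a query such as `WHERE id > :after_id ORDER BY id LIMIT :limit + 1`.

```python
import base64
import json
from typing import Optional

# Hypothetical dataset, already sorted by id.
ITEMS = [{"id": i, "title": f"post {i}"} for i in range(1, 8)]


def encode_cursor(last_id: int) -> str:
    """Opaque cursor: clients pass it back without interpreting it."""
    raw = json.dumps({"after_id": last_id}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))["after_id"]


def list_items(limit: int, cursor: Optional[str] = None) -> dict:
    after_id = decode_cursor(cursor) if cursor else 0
    # Fetch one extra row to learn whether another page exists.
    window = [item for item in ITEMS if item["id"] > after_id][: limit + 1]
    page = window[:limit]
    next_cursor = encode_cursor(page[-1]["id"]) if len(window) > limit else None
    return {"items": page, "next_cursor": next_cursor}


page1 = list_items(limit=3)
page2 = list_items(limit=3, cursor=page1["next_cursor"])
page3 = list_items(limit=3, cursor=page2["next_cursor"])
```

Because each page starts strictly after the last ID the client saw, rows inserted or deleted earlier in the list do not shift later pages the way they do with offsets.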

What is rate limiting and how do you implement it?

Rate limiting restricts the number of requests a client can make in a time window, protecting against abuse and ensuring fair resource usage. Algorithms: fixed window (simple, burst-prone), sliding window (smooth, more accurate), token bucket (allows controlled bursts), leaky bucket (constant rate). Implement with Redis (INCR with TTL for fixed window, sorted sets for sliding window). Return 429 Too Many Requests with Retry-After and rate limit headers.
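
Of the algorithms listed, the token bucket is a common one to be asked to implement. The sketch below is a single-process version with an injectable clock for testing; the class name and parameters are assumptions, and a shared limiter across servers would keep this state in something like Redis instead.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self) -> bool:
        """Return True and spend a token if the request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        # Refill based on elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False  # caller would respond with 429 Too Many Requests
```

A usage example with a fake clock: a bucket with capacity 2 and a refill rate of 1 token per second admits two immediate requests, rejects the third, and admits one more after a second has passed.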

How to Prepare for Backend Engineering Interviews

1. Build Real Systems

The best preparation is building real backend systems. Create a REST API with authentication, implement caching with Redis, set up a message queue for background processing, and deploy with Docker. When you have built these systems yourself, interview questions become descriptions of things you have already solved.

2. Understand Trade-offs Deeply

Backend interviews are fundamentally about trade-offs: REST vs. GraphQL, SQL vs. NoSQL, synchronous vs. asynchronous, consistency vs. availability. For every decision, be prepared to explain the pros, cons, and specific use cases. Avoid absolute statements like "always use microservices".

3. Know Your Framework Inside Out

Whether you use Spring Boot, Express on Node.js, Django, or Go's standard library, know your stack deeply. Understand its request lifecycle, middleware/filter chain, dependency injection, error handling, and testing patterns.

4. Study Distributed Systems Fundamentals

Backend engineering at scale is distributed systems engineering. Understand the CAP theorem, consensus protocols (Raft, Paxos), distributed transactions, eventual consistency, vector clocks, and leader election.

5. Practice Explaining Your Architecture

Many backend interviews include a design component where you draw and explain an architecture. Practice explaining your designs out loud: why you chose specific components, how data flows through the system, and how you would handle 10x growth.
