[Spring] N+1 solution

[IMPORTANT] Trade-off !

DTO Projections:

  1. Pros:

    • Query only the specific data you need.

    • Can be more efficient in terms of database load and network transfer.

    • Allows for custom calculations and aggregations at the database level.

    • Gives you fine-grained control over the exact data being fetched.

  2. Cons:

    • Requires writing more complex queries.

    • The queries are often less reusable and more tied to specific use cases.

    • Can be harder to maintain if your data requirements change frequently.

Entity Graphs:

  1. Pros:

    • Work directly with JPA entities, which can be more intuitive for domain-driven design.

    • More flexible for reuse in different scenarios.

    • Can be easier to work with if you need to manipulate the entities after fetching.

  2. Cons:

    • As you pointed out, it fetches entire entities, which might include more data than needed for a specific use case.

    • Can be less efficient if you only need a small subset of the data.

    • Potential for over-fetching data, which could impact performance, especially with large datasets.

Your assessment is spot on:

  • DTO Projections allow you to query exactly what you need, often resulting in more efficient database operations.

  • Entity Graphs fetch complete entities and their relationships, potentially bringing more data from the database than strictly necessary for a specific use case.

The choice between these approaches often depends on your specific needs:

  • Use DTO Projections when you need very specific data, especially if it involves calculations or if you're dealing with large datasets where performance is critical.

  • Use Entity Graphs when you need to work with full entity objects, when your use cases vary and you need more flexibility, or when the extra data isn't a significant overhead.

In practice, many applications use a combination of both approaches, choosing the most appropriate one for each specific scenario. The key is to understand the trade-offs and select the right tool for each job.

[DTO Projection] A has many B case Solution

DTO Projections allow you to fetch only the data you need, potentially including data from related entities, in a single query.

First, define a DTO interface:

kotlinCopyinterface PostSummary {
    val id: Long
    val title: String
    val commentCount: Int
}

Then, add a method to your repository that uses this projection:

kotlinCopyinterface PostRepository : JpaRepository<Post, Long> {
    @Query("""
        SELECT p.id as id, p.title as title, COUNT(c.id) as commentCount 
        FROM Post p 
        LEFT JOIN p.comments c 
        GROUP BY p.id, p.title
    """)
    fun findAllPostSummaries(): List<PostSummary>
}

Usage:

kotlinCopyval postSummaries = postRepository.findAllPostSummaries()
postSummaries.forEach { summary ->
    println("${summary.title}: ${summary.commentCount} comments")
}

In this case, we're fetching a summary of posts with their comment counts in a single query, avoiding the N+1 problem entirely.

You can also use class-based projections for more flexibility:

kotlinCopydata class PostSummaryDto(
    val id: Long,
    val title: String,
    val commentCount: Int
)

interface PostRepository : JpaRepository<Post, Long> {
    @Query("""
        SELECT new com.example.PostSummaryDto(p.id, p.title, COUNT(c.id)) 
        FROM Post p 
        LEFT JOIN p.comments c 
        GROUP BY p.id, p.title
    """)
    fun findAllPostSummaries(): List<PostSummaryDto>
}

Both Entity Graphs and DTO Projections can significantly reduce the number of queries executed, thereby solving the N+1 problem. The choice between them often depends on your specific use case:

  • Use Entity Graphs when you need the full entity objects and their relationships.

  • Use DTO Projections when you only need specific fields or calculated values, especially when aggregating data from multiple entities.

Remember, the most appropriate solution can vary depending on your specific requirements, data model, and performance needs.

[Entity Graphs] A has many B case Solution

Entity Graphs allow you to specify which related entities should be fetched eagerly in a single query.

First, define the entity graph:

kotlinCopy@Entity
@NamedEntityGraph(
    name = "Post.withComments",
    attributeNodes = [NamedAttributeNode("comments")]
)
class Post(
    @Id val id: Long,
    val title: String,
    @OneToMany(mappedBy = "post")
    val comments: List<Comment>
)

@Entity
class Comment(
    @Id val id: Long,
    val content: String,
    @ManyToOne
    val post: Post
)

Then, use the entity graph in your repository:

kotlinCopyinterface PostRepository : JpaRepository<Post, Long> {
    @EntityGraph(value = "Post.withComments")
    override fun findAll(): List<Post>
}

Usage:

kotlinCopyval posts = postRepository.findAll()
// Now you can access comments without additional queries
posts.forEach { post ->
    println("${post.title}: ${post.comments.size} comments")
}

[DTO Projection] A has many B, B has many C solution

Excellent, thank you for specifying that you're using PostgreSQL. PostgreSQL offers some powerful features for handling this kind of hierarchical data retrieval. Let's update our example to use PostgreSQL-specific features for a more efficient query.

Here's an updated version using PostgreSQL's JSON aggregation functions:

  1. First, let's update our DTO classes:

data class AuthorSummaryDto(
    val id: Long,
    val name: String,
    val books: List<BookSummaryDto>
) {
    constructor(json: PGobject) : this(
        id = JsonPath.read(json.value, "$.id"),
        name = JsonPath.read(json.value, "$.name"),
        books = JsonPath.read<List<Map<String, Any>>>(json.value, "$.books").map { book ->
            BookSummaryDto(
                id = book["id"] as Long,
                title = book["title"] as String,
                reviews = (book["reviews"] as List<Map<String, Any>>).map { review ->
                    ReviewDto(
                        id = review["id"] as Long,
                        title = review["title"] as String,
                        rating = review["rating"] as Int
                    )
                },
                averageRating = book["averageRating"] as Double
            )
        }
    )
}
  1. Now, let's update our repository to use PostgreSQL's JSON functions:

kotlinCopyinterface AuthorRepository : JpaRepository<Author, Long> {
    @Query(value = """
        SELECT 
            json_build_object(
                'id', a.id,
                'name', a.name,
                'books', (
                    SELECT json_agg(
                        json_build_object(
                            'id', b.id,
                            'title', b.title,
                            'reviews', (
                                SELECT json_agg(
                                    json_build_object(
                                        'id', r.id,
                                        'title', r.title,
                                        'rating', r.rating
                                    )
                                )
                                FROM review r
                                WHERE r.book_id = b.id
                            ),
                            'averageRating', (
                                SELECT COALESCE(AVG(r.rating), 0)
                                FROM review r
                                WHERE r.book_id = b.id
                            )
                        )
                    )
                    FROM book b
                    WHERE b.author_id = a.id
                )
            )
        FROM author a
    """, nativeQuery = true)
    fun findAllAuthorsWithBooksAndReviews(): List<String>
}
  1. We'll need a service to parse the JSON result:

@Service
class AuthorService(private val authorRepository: AuthorRepository) {
    fun getAuthorsWithBooksAndReviews(): List<AuthorSummaryDto> {
        return authorRepository.findAllAuthorsWithBooksAndReviews()
    }
}
  1. Usage:

kotlinCopy@Service
class ReportingService(private val authorService: AuthorService) {
    fun generateReport() {
        val authorSummaries = authorService.getAuthorsWithBooksAndReviews()
        authorSummaries.forEach { author ->
            println("${author.name} has written ${author.books.size} books:")
            author.books.forEach { book ->
                println("  - ${book.title} (Average rating: ${book.averageRating})")
                println("    Reviews:")
                book.reviews.forEach { review ->
                    println("      * ${review.title} (Rating: ${review.rating})")
                }
            }
        }
    }
}

This approach has several advantages:

  1. It uses a single query to fetch all the data, avoiding the N+1 problem.

  2. It leverages PostgreSQL's JSON functions to structure the data hierarchically in the database.

  3. It allows for efficient retrieval of nested data structures.

  4. The JSON processing in the application layer allows for type-safe conversion to your DTO classes.

Remember to add the necessary dependency for JSON processing (if not already present):

gradleCopyimplementation 'com.fasterxml.jackson.module:jackson-module-kotlin'

This solution provides a good balance between query efficiency and data structure flexibility. It retrieves all necessary data in a single query while maintaining the hierarchical relationship between authors, books, and reviews.

[DTO Projection - direct response] A has many B, B has many C solution

If you want to return the JSON data directly as a response, that's even simpler and more efficient. Since PostgreSQL is already giving us the data in a structured JSON format, we can return it directly without any additional processing. Here's how you can do that:

  1. First, let's modify our repository to return the raw JSON:

interface AuthorRepository : JpaRepository<Author, Long> {
    @Query(nativeQuery = true, value = """
        SELECT 
            json_build_object(
                'id', a.id,
                'name', a.name,
                'books', (
                    SELECT json_agg(
                        json_build_object(
                            'id', b.id,
                            'title', b.title,
                            'reviews', (
                                SELECT json_agg(
                                    json_build_object(
                                        'id', r.id,
                                        'title', r.title,
                                        'rating', r.rating
                                    )
                                )
                                FROM review r
                                WHERE r.book_id = b.id
                            ),
                            'averageRating', (
                                SELECT COALESCE(AVG(r.rating), 0)
                                FROM review r
                                WHERE r.book_id = b.id
                            )
                        )
                    )
                    FROM book b
                    WHERE b.author_id = a.id
                )
            )
        FROM author a
    """)
    fun findAllAuthorsWithBooksAndReviewsAsJson(): List<String>
}
  1. Now, let's create a simple service to fetch this data:

@Service
class AuthorService(private val authorRepository: AuthorRepository) {
    fun getAuthorsWithBooksAndReviewsAsJson(): List<String> {
        return authorRepository.findAllAuthorsWithBooksAndReviewsAsJson()
    }
}
  1. Finally, let's create a controller to return this JSON data:

@RestController
@RequestMapping("/api/authors")
class AuthorController(private val authorService: AuthorService) {

    @GetMapping
    fun getAllAuthorsWithBooksAndReviews(): ResponseEntity<List<String>> {
        val jsonData = authorService.getAuthorsWithBooksAndReviewsAsJson()
        return ResponseEntity.ok(jsonData)
    }
}

[Entity Graph] A has many B, B has many C solution

Certainly! I'll demonstrate how to use Entity Graphs to solve the same problem with authors, books, and reviews. This approach will use JPA's EntityGraph feature to eagerly fetch the entire object graph in a single query.

First, let's define our entities with the appropriate annotations:

kotlinCopy@Entity
@NamedEntityGraph(
    name = "author-books-reviews",
    attributeNodes = [
        NamedAttributeNode(value = "books", subgraph = "books-subgraph")
    ],
    subgraphs = [
        NamedSubgraph(
            name = "books-subgraph",
            attributeNodes = [NamedAttributeNode("reviews")]
        )
    ]
)
class Author(
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    val id: Long = 0,
    
    val name: String,
    
    @OneToMany(mappedBy = "author", cascade = [CascadeType.ALL], fetch = FetchType.LAZY)
    val books: List<Book> = mutableListOf()
)

@Entity
class Book(
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    val id: Long = 0,
    
    val title: String,
    
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "author_id")
    val author: Author,
    
    @OneToMany(mappedBy = "book", cascade = [CascadeType.ALL], fetch = FetchType.LAZY)
    val reviews: List<Review> = mutableListOf()
)

@Entity
class Review(
    @Id @GeneratedValue(strategy = GenerationType.IDENTITY)
    val id: Long = 0,
    
    val title: String,
    val rating: Int,
    
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "book_id")
    val book: Book
)

Now, let's create a repository that uses this EntityGraph:

kotlinCopyinterface AuthorRepository : JpaRepository<Author, Long> {
    @EntityGraph(value = "author-books-reviews", type = EntityGraph.EntityGraphType.LOAD)
    override fun findAll(): List<Author>
}

With this setup, when you call findAll(), it will eagerly fetch all authors, their books, and the reviews for those books in a single query.

To use this in a service:

kotlinCopy@Service
class AuthorService(private val authorRepository: AuthorRepository) {
    fun getAllAuthorsWithBooksAndReviews(): List<AuthorSummaryDto> {
        val authors = authorRepository.findAll()
        return authors.map { author ->
            AuthorSummaryDto(
                id = author.id,
                name = author.name,
                books = author.books.map { book ->
                    BookSummaryDto(
                        id = book.id,
                        title = book.title,
                        reviews = book.reviews.map { review ->
                            ReviewDto(
                                id = review.id,
                                title = review.title,
                                rating = review.rating
                            )
                        },
                        averageRating = book.reviews.map { it.rating }.average()
                    )
                }
            )
        }
    }
}

data class AuthorSummaryDto(
    val id: Long,
    val name: String,
    val books: List<BookSummaryDto>
)

data class BookSummaryDto(
    val id: Long,
    val title: String,
    val reviews: List<ReviewDto>,
    val averageRating: Double
)

data class ReviewDto(
    val id: Long,
    val title: String,
    val rating: Int
)

Usage example:

kotlinCopy@Service
class ReportingService(private val authorService: AuthorService) {
    fun generateReport() {
        val authorSummaries = authorService.getAllAuthorsWithBooksAndReviews()
        authorSummaries.forEach { author ->
            println("${author.name} has written ${author.books.size} books:")
            author.books.forEach { book ->
                println("  - ${book.title} (Average rating: ${book.averageRating})")
                println("    Reviews:")
                book.reviews.forEach { review ->
                    println("      * ${review.title} (Rating: ${review.rating})")
                }
            }
        }
    }
}

This Entity Graph approach has several advantages:

  1. It solves the N+1 problem by fetching all related entities in a single query.

  2. It works with your existing entity model without requiring changes to your database schema.

  3. It's more JPA-standard and can be easier to understand for developers familiar with JPA.

  4. It allows you to work directly with entity objects, which can be beneficial if you need to perform operations on these entities.

However, it also has some potential drawbacks:

  1. It always fetches the entire object graph, which might be more data than you need in some cases.

  2. For very large datasets, it might lead to performance issues due to the size of the result set.

  3. It doesn't provide as much flexibility in terms of custom aggregations compared to the SQL/JSON approach.

Choose the approach that best fits your specific use case and performance requirements. The Entity Graph approach is often a good choice when you need the full entity objects and when the result set isn't too large.

Last updated

Was this helpful?