
In this lesson, we explore the fundamental concepts of database structure, specifically focusing on best practices for creating scalable applications. By comparing an older codebase with a new approach, we discuss the use of arrays of references and multiple collections within a MongoDB schema for managing relationships effectively.
00:00:02 In this lesson, I'll share with you the entire database structure of our application, including all of the collections, models, and instances that we'll develop
00:00:12 very soon.
00:00:13 But before I go ahead and do that, I actually want to share with you a snippet from the old codebase.
00:00:19 As you know, I'm constantly keeping this course updated so you can learn about the latest and greatest features of whatever technologies we're using.
00:00:27 And sometimes I notice improvements that I can make over the structures that I used in previous versions of the codebase.
00:00:34 And we can use that as a great learning example.
00:00:37 Although that solution was fully functional and worked well enough, it wasn't scalable enough, for a couple of reasons.
00:00:45 Specifically, we're looking at a question schema right here.
00:00:48 That is this one right here.
00:00:50 The center and the core of our application, where people can ask different questions.
00:00:55 The main problem was using arrays of references for managing multiple relations.
00:01:00 You can see that we have done that for the tags, as well as for the upvotes, downvotes, answers, and more.
00:01:08 Although it's not bad, it has its own pros and cons.
00:01:12 When creating different collections in your MongoDB schema, you have to ask yourself whether you want to keep track of an array of references,
00:01:20 or you want to create multiple collections in MongoDB depending on the specific requirements of your application.
00:01:26 So let me give you an example.
00:01:27 You want to use an array of references when you have many-to-many relationships.
00:01:33 So when a document type can relate to multiple instances of another type and vice versa, like students and courses for example,
00:01:42 students can be enrolled in many courses, and each course can have many students enrolled in it.
00:01:48 Another case for arrays of references is when you have frequent queries on the related data, when you often need to retrieve the related documents together,
00:01:58 and when you need lightweight relationships, so when the related documents are not overly complex and don't require a separate structure.
00:02:07 A perfect example of this is when you have a student document with an ID of student one and name of John Doe, and he can be enrolled in multiple courses
00:02:16 like course one and course two, which is an array of references to course IDs.
00:02:22 And then you have course documents, where each course provides an array of references to the IDs of the students that are enrolled in that course.
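As a rough sketch of the student and course example above, the two document shapes could look like this in TypeScript. The interface names, the `fullName` and `title` fields, and the `enroll` helper are assumptions made for illustration, and plain objects stand in for MongoDB documents.

```typescript
// Hypothetical shapes; plain objects stand in for MongoDB documents.
interface StudentDoc {
  _id: string;
  fullName: string;
  courses: string[]; // array of references to course IDs
}

interface CourseDoc {
  _id: string;
  title: string;
  students: string[]; // array of references to student IDs
}

const john: StudentDoc = { _id: "student1", fullName: "John Doe", courses: [] };
const courseOne: CourseDoc = { _id: "course1", title: "Course One", students: [] };
const courseTwo: CourseDoc = { _id: "course2", title: "Course Two", students: [] };

// Enrolling touches both documents, because each side keeps its own array.
function enroll(student: StudentDoc, course: CourseDoc): void {
  if (student.courses.indexOf(course._id) === -1) {
    student.courses.push(course._id);
  }
  if (course.students.indexOf(student._id) === -1) {
    course.students.push(student._id);
  }
}

enroll(john, courseOne);
enroll(john, courseTwo);
// john.courses now holds "course1" and "course2"; each course lists "student1".
```

Note that one enrollment is recorded in two places, so both arrays have to be kept in sync by the application.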
00:02:32 And this is the exact approach that we used right here to attach different answers to a question.
00:02:38 You can see that each question had an array of references to different answers.
00:02:44 But this also has its pros and cons.
00:02:47 The pro is that it simplifies the data retrieval, because you can just fetch related documents with a single query if they're stored together.
00:02:56 What does that mean?
00:02:57 That means that if you fetch a course in this case, you can immediately fetch all the students who are enrolled in that course.
00:03:05 Or if you fetch the student to display the student details page, you can also immediately fetch all of their courses.
00:03:11 Super simple.
00:03:13 Same thing for questions and answers.
00:03:15 If you fetch the question, you can immediately fetch all of the answers attached to that question.
00:03:20 And another pro is that it's easier to maintain referential integrity, as you can enforce constraints in your application logic.
00:03:28 In simple terms, this means that these documents are kind of connected, so when you're fetching a student, you're also fetching courses,
00:03:35 and vice versa.
00:03:37 But what are the cons?
00:03:38 The cons are that it can lead to large documents if these arrays grow significantly.
00:03:44 Like, let's say that you're just trying to fetch student details, but then that student can be enrolled in hundreds of different courses,
00:03:51 and then you have to fetch all of these courses by simply making a get student query.
00:03:56 And another one is that it requires extra queries to resolve references, which may affect performance.
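The "extra queries to resolve references" con can be sketched like this. The question document only stores answer IDs, so getting the actual answers is a second lookup. The shapes, the sample data, and the `resolveAnswers` helper are hypothetical, and plain arrays stand in for MongoDB collections. Libraries such as Mongoose can hide this step behind a helper like `populate`, but the extra lookup still happens.

```typescript
// Hypothetical shapes; plain arrays stand in for MongoDB collections.
interface QuestionDoc {
  _id: string;
  title: string;
  answers: string[]; // array of references to answer IDs
}

interface AnswerDoc {
  _id: string;
  content: string;
}

const answerCollection: AnswerDoc[] = [
  { _id: "answer1", content: "Use an index." },
  { _id: "answer2", content: "Paginate the results." },
  { _id: "answer3", content: "Belongs to another question." },
];

// Lookup 1: fetching the question returns the whole array of IDs with it,
// however long that array has grown.
const questionDoc: QuestionDoc = {
  _id: "question1",
  title: "Why is my query slow?",
  answers: ["answer1", "answer2"],
};

// Lookup 2: the IDs are resolved against the answers collection.
function resolveAnswers(question: QuestionDoc, answers: AnswerDoc[]): AnswerDoc[] {
  return answers.filter((answer) => question.answers.indexOf(answer._id) !== -1);
}

const resolved: AnswerDoc[] = resolveAnswers(questionDoc, answerCollection);
```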
00:04:02 So to fix those cons, an alternative is to create multiple collections.
00:04:08 I would advise doing this when you have complex relationships.
00:04:12 So when the relationships between documents are complex, or when the entities are independent enough to warrant their own collection,
00:04:21 or when you have siloed data, where the related data has distinct life cycles or is managed independently, or when you need flexibility and scalability,
00:04:30 when you want to allow for growth in either collection independently without impacting the other.
00:04:37 An example of this would be to have a student collection, like we had before, but this time without the references,
00:04:44 a course collection, like we had before, but without the references to the students, and finally, a third collection called enrollment,
00:04:53 which allows you to tie those two collections together: this student ID is attached to this course ID,
00:05:01 and that counts as an enrollment.
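Here is a minimal sketch of that three-collection layout. The record names, the sample data, and the `enrollStudent` helper are assumptions for illustration, and plain arrays stand in for MongoDB collections.

```typescript
// Hypothetical shapes; note that students and courses hold no reference arrays.
interface StudentRecord {
  _id: string;
  fullName: string;
}

interface CourseRecord {
  _id: string;
  title: string;
}

// One enrollment document ties exactly one student to one course.
interface EnrollmentRecord {
  _id: string;
  studentId: string;
  courseId: string;
}

const studentsCollection: StudentRecord[] = [{ _id: "student1", fullName: "John Doe" }];
const coursesCollection: CourseRecord[] = [
  { _id: "course1", title: "Course One" },
  { _id: "course2", title: "Course Two" },
];
const enrollmentsCollection: EnrollmentRecord[] = [];

// Enrolling only writes to the enrollments collection; the student and
// course documents stay the same size no matter how many enrollments exist.
function enrollStudent(studentId: string, courseId: string): EnrollmentRecord {
  const existing = enrollmentsCollection.filter(
    (e) => e.studentId === studentId && e.courseId === courseId
  );
  if (existing.length > 0) {
    return existing[0];
  }
  const record: EnrollmentRecord = {
    _id: "enrollment" + (enrollmentsCollection.length + 1),
    studentId: studentId,
    courseId: courseId,
  };
  enrollmentsCollection.push(record);
  return record;
}

enrollStudent("student1", "course1");
enrollStudent("student1", "course2");
```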
00:05:03 Hopefully, now that you see this third collection, all of this is starting to make more sense.
00:05:08 The pros of this are that it promotes a clean separation of concerns, making your data model cleaner and easier to manage.
00:05:16 It facilitates better scalability, as you can independently scale collections and manage them without impacting others.
00:05:23 And it also reduces the size of the documents, preventing potential performance issues associated with large documents.
00:05:31 Of course, every solution comes with its cons.
00:05:34 So in this case, you require more complex queries, especially when retrieving the related data.
00:05:40 You cannot just say, give me the student, and automatically fetch the courses that the student is enrolled in.
00:05:46 You now have to write a more explicit query.
00:05:49 And another con is that it adds overhead to maintain those relationships, such as ensuring consistency across multiple collections.
00:05:58 But if you do it properly, and if you learn how to write these complex queries, your queries will be faster and your database will be more scalable.
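To make "more complex queries" concrete, here is a sketch of fetching a student's courses through the enrollment collection. The names and sample data are hypothetical, and in-memory arrays stand in for collections. In MongoDB itself this would typically be two queries or an aggregation with a `$lookup` stage.

```typescript
// Hypothetical shapes; plain arrays stand in for MongoDB collections.
interface CourseRecord {
  _id: string;
  title: string;
}

interface EnrollmentRecord {
  _id: string;
  studentId: string;
  courseId: string;
}

const allCourses: CourseRecord[] = [
  { _id: "course1", title: "Course One" },
  { _id: "course2", title: "Course Two" },
  { _id: "course3", title: "Course Three" },
];

const allEnrollments: EnrollmentRecord[] = [
  { _id: "enrollment1", studentId: "student1", courseId: "course1" },
  { _id: "enrollment2", studentId: "student1", courseId: "course2" },
  { _id: "enrollment3", studentId: "student2", courseId: "course2" },
];

function coursesForStudent(
  studentId: string,
  enrollments: EnrollmentRecord[],
  courses: CourseRecord[]
): CourseRecord[] {
  // Step 1: find this student's enrollment documents.
  const courseIds = enrollments
    .filter((e) => e.studentId === studentId)
    .map((e) => e.courseId);
  // Step 2: fetch only the courses those enrollments point to.
  return courses.filter((c) => courseIds.indexOf(c._id) !== -1);
}

// The reverse direction uses the same collection, filtered by course instead.
function studentIdsForCourse(courseId: string, enrollments: EnrollmentRecord[]): string[] {
  return enrollments.filter((e) => e.courseId === courseId).map((e) => e.studentId);
}
```

The query is longer than reading an array off the student, but it only ever touches the enrollments that match.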
00:06:07 So this is kind of the final too-long-didn't-read part of the document.
00:06:10 If the size of the arrays is manageable and the data is not too complex, an array of references might be okay.
00:06:18 But for complex relationships, multiple collections are better.
00:06:22 If you need to query data quickly, then embedding or using arrays of references can improve performance.
00:06:28 But for large datasets, multiple collections can help keep documents smaller.
00:06:33 Also, if you anticipate growth or changes in how entities relate, creating multiple collections can provide flexibility and scalability.
00:06:41 But for smaller projects or less complex data models, using arrays of references may simplify development and reduce the overhead of managing multiple collections.
00:06:50 So as you can see, each approach has its pros and cons.
00:06:53 But in this case, I promised that I would teach you production level applications and not just applications that work.
00:07:00 For that reason, we are redoing the old structure with a new, much more scalable structure instead of relying solely on arrays of references for all of
00:07:10 the reasons that I have mentioned.
00:07:11 But let me explain those reasons using the current flow of the database structure.
00:07:16 Everything starts with a user.
00:07:18 They have some properties like ID, name, username, email, and so on.
00:07:23 We also have this reputation number, which is going to be very exciting as it'll play into our recommendation system.
00:07:29 Next, each user has an account.
00:07:33 That's pretty straightforward, right?
00:07:35 A user can also ask questions.
00:07:39 And this is a one-to-many relationship because each user can ask multiple questions.
00:07:44 But now where things get a bit more complicated is where each question can have multiple answers.
00:07:50 And then each answer can also have its own comments.
00:07:54 So if we store these in arrays of references, the arrays can potentially become huge, hitting the document size limit and slowing down queries.
00:08:02 But if we separate these collections, it'll allow for faster querying and better management of these relationships.
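To put a rough number on the document limit: MongoDB caps a single BSON document at 16 MiB, and each ObjectId reference stored in an array costs 12 bytes plus a small per-element overhead. Those figures come from the MongoDB and BSON documentation rather than from this lesson, and the sketch below ignores every other field in the document, so treat it as a back-of-the-envelope estimate. The function names are made up for illustration.

```typescript
// Assumed constants, taken from the MongoDB and BSON documentation.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;
const OBJECT_ID_BYTES = 12;

// Estimates the encoded size of an array holding `count` ObjectId references.
function referenceArrayBytes(count: number): number {
  let total = 5; // 4-byte length prefix plus 1 terminating byte for the array
  for (let i = 0; i < count; i++) {
    const keyBytes = String(i).length + 1; // the index as a string key, plus a NUL byte
    total += 1 + keyBytes + OBJECT_ID_BYTES; // type byte + key + ObjectId value
  }
  return total;
}

// True when the array alone would still fit inside one document.
function fitsInOneDocument(count: number): boolean {
  return referenceArrayBytes(count) < MAX_BSON_DOCUMENT_BYTES;
}
```

Under these assumptions, 100,000 references take roughly 1.8 MiB, while 1,000,000 references would not fit in a single document.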
00:08:09 By creating separate collections, you can modify the structure of each entity independently.
00:08:14 For instance, if you want to add more fields to the answer collection, like maybe an edited date, you can do that without affecting the questions or comments collections.
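A sketch of what separate question, answer, and comment collections could look like follows. All names here are hypothetical, including the optional `editedAt` field and the `limit` parameter, and plain arrays stand in for MongoDB collections. Each child document points at its parent with a single ID, so the parent never accumulates an array.

```typescript
// Hypothetical shapes; each child holds one reference to its parent.
interface QuestionEntry {
  _id: string;
  authorId: string;
  title: string;
}

interface AnswerEntry {
  _id: string;
  questionId: string; // single reference back to the question
  content: string;
  editedAt?: string; // added later; questions and comments are unaffected
}

interface CommentEntry {
  _id: string;
  answerId: string; // single reference back to the answer
  content: string;
}

const questionEntry: QuestionEntry = {
  _id: "question1",
  authorId: "user1",
  title: "How do I model relations?",
};

const answerEntries: AnswerEntry[] = [
  { _id: "answer1", questionId: "question1", content: "Use references." },
  { _id: "answer2", questionId: "question1", content: "Use a separate collection.", editedAt: "2024-01-01" },
  { _id: "answer3", questionId: "question1", content: "It depends." },
];

const commentEntries: CommentEntry[] = [
  { _id: "comment1", answerId: "answer2", content: "This helped, thanks." },
];

// Fetch at most `limit` answers for a question instead of loading all of them.
function answersForQuestion(questionId: string, answers: AnswerEntry[], limit: number): AnswerEntry[] {
  return answers.filter((a) => a.questionId === questionId).slice(0, limit);
}

function commentsForAnswer(answerId: string, comments: CommentEntry[]): CommentEntry[] {
  return comments.filter((c) => c.answerId === answerId);
}

const firstAnswers: AnswerEntry[] = answersForQuestion(questionEntry._id, answerEntries, 2);
```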
00:08:24 Overall, what I wanted to do in this video is to give you an overview of the database structure that we'll use for this project,
00:08:31 as well as some of the differences between using arrays of references and creating multiple collections, like that additional enrollment collection that I showed you
00:08:39 in the student and course example.
00:08:40 I know this might seem like a lot right now, but trust me, everything will start making sense once we start creating our own models for this application.
00:08:49 So I'll now delete this old example that I wanted to show you, which means that we are ready to start setting up our new database structure.
00:08:56 So let's do that next.