SQL - Basics & Advanced

SQL - Basics & Advanced

·

4 min read

Relational Database Management Systems

An RDBMS uses the client-server model. This means that it’s a server process that responds to client requests rather than relying on another program to send requests to the server.

SQL is divided into two basic categories:

  • Data definition language (DDL): This is the set of SQL statements that describe the structure of the data to the server. This is how the data has to be.

  • Data manipulation language (DML): This is the set of SQL statements that tell the server what we want to do with the data. Generally, this is limited to create, retrieve, update, and delete commands.

Relationships between these entities:

One-to-one (1-1): One business entity is related to another single business entity.

  • One-on-One meeting with your manager.

One-to-many (1-n): One business entity is related to many other business entities.

  • Your manager leading 3 teams at a time.

Many-to-many (m-n): Many business entities are related to many other business entities.

  • In Learn, Many students can be enrolled in many courses.

Many-to-one ( m-1): Many business entities are related to another single business entities.

  • Your ADO dashboard, multiple tickets are assigned to you ( one ).

show databases;

use bblearn_git;

show tables;

create table customers ( id int(11) not null auto increment, name varchar(255), primary key(id) );

CRUD

insert into customers ( name ) values ( ' John Wick ' );

select * from customers;

  • Filter - Where clause

  • Sort - Order by

  • Count - count(*)

  • Pagination - limit-offset, keyset, cursor

update customers set name = ' Baabayaga ' where id = 1;

delete from customers where id = 1;


Add a column - alter table customers add column(identity varchar(32) not null);

Indexes

Searching for a particular customer inside a table that has two customers is quick. The response is returned within 0.00 seconds. Is the response equally fast when a table has two million customers, though? No way.

To speed up the response of these queries, we need to use indexes. An index is a tool that will help us search. MySQL uses the index in the background to retrieve data quickly.

create index customers_identity_idx on customers (identity);

TIP: It’s generally helpful to follow the <table_name>_<column>_idx pattern when naming our indexes.

For Multiple columns :

create index customers_name_identity_idx on 
customers(name,identity);

TIP: Also, it ensures that we can’t have null values on the primary key column.

drop index customers_identity_idx on customers; -- To remove an index

At this time, duplicates are allowed in the above index - 'Business Error'.

Solution: Use unique index

create unique index customers_identity_uidx on customers(identity);

TIP: A primary key index is created by default when a table is created with a primary key specified.

You see that your query to retrieve a customer by name takes a long time to finish. What should you look for when trying to solve this problem?

You should introduce an index on the name column of the customers table.

Performance:

Having index - We don’t need to add an index to a table if we don’t want it. The presence of an index on one or more columns renders the DML commands (insert, update, delete) a bit slower. This is because except from the fact that the actual table content needs to be amended, it is also necessary to update the content of the corresponding indexes.


Pagination - Choosing the Right Technique:

In PostgreSQL, choosing the right pagination technique depends on the specific requirements of your application and the characteristics of your data. Two common pagination techniques are OFFSET and LIMIT, and keyset pagination.

  1. OFFSET and LIMIT:

    • Pros:

      • Simple and straightforward.

      • Easy to implement.

    • Cons:

      • Can be inefficient for large OFFSET values, as PostgreSQL needs to skip rows before returning the result.

      • Performance degradation as the offset increases.

    SELECT * FROM your_table ORDER BY some_column OFFSET 10 LIMIT 10;
  1. Keyset Pagination (or Seek Method):

    • Pros:

      • More efficient for large datasets, especially when using indexed columns.

      • Consistent performance, regardless of the page number.

    • Cons:

      • Requires a unique, ordered column for pagination (e.g., timestamp or numeric ID).

      • Slightly more complex to implement.

    SELECT * FROM your_table WHERE some_column > last_value ORDER BY some_column LIMIT 10;

This assumes that last_value is the last value from the previous page, and you're ordering by some_column.

  1. Cursor-based Pagination:

    • Pros:

      • Similar to keyset pagination but can be more efficient in certain scenarios.

      • Well-suited for real-time data.

    • Cons:

      • Requires maintaining a cursor for each query.

      • Requires an ordered, unique column.

    SELECT * FROM your_table WHERE some_column > last_value ORDER BY some_column LIMIT 10;

Similar to keyset pagination, but the last_value here is a cursor representing the position in the result set.

Choosing the Right Technique:

  • For small datasets or cases where you don't expect many pages, OFFSET and LIMIT might be sufficient.

  • For larger datasets, especially when ordered by an indexed column, keyset pagination is often more efficient.

  • Cursor-based pagination can be useful for real-time data where the dataset is frequently changing.

Always analyze the specific needs of your application, the size of your dataset, and the nature of your queries before choosing a pagination technique in PostgreSQL. It's also recommended to use proper indexing on the columns involved in ordering to optimize query performance.

Footer

https://www.educative.io/courses/getting-started-sql-relational-databases