Merge join Vs Hash join Vs Nested loop join

Advertisements

This month’s TSQL Tuesday party is being hosted by Stuart R Ainsworth (Blog| Twitter). I am very glad to write my first blog post as t-SQL Tuesday post on my newly designed website.

SQL server has three types of internal joins. I know most of folks never heard this join type because it’s not logical join and it’s not often used in their codes.

Then, when it will be used?

Well the answer is “it depends”.

This means it depends upon the record sets and indexes. The query optimizer will be smart and always try to pick up the most optimal physical joins. As we know SQL optimizer creates a plan cost based and depends upon the query cost it will choose the best join.

How the query optimizer will choose the join type internally?

Well, there is some algorithm has written internally for the query optimizer to choose the join type.

Let’s go for some practical examples and will finally summarize it.
First I will give some basic idea how the join will work and when/How the optimizer will decide to use anyone of the internal join (Physical join).

• It depends upon the table size
• It depends upon the index on the join column
• It depends upon the Sorted order on the join column

Test:

The test has done by following configuration.

RAM: 4 GB
Server : SQL server 2008 (RTM)

create table tableA (id int identity ,name varchar(50))
declare @i int
set @i=0
while (@i<100)
begin
insert into tableA (name)
select name from master.dbo.spt_values
set @i=@i+1
end
--select COUNT(*) from dbo.tableA --250600
go
create table tableB (id int identity ,name varchar(50))
declare @i int
set @i=0
while (@i<100)
begin
insert into tableB (name)
select name from master.dbo.spt_values
set @i=@i+1
end
-- select COUNT(*) from dbo.tableB --250600
select * from dbo.tableA A join tableB B
on (a.id=b.id)

Test1: Without Index

Let’s create a clustered index

create unique clustered index cx_tableA on tableA (id)
create unique clustered index cx_tableB on tableB (id)

Test1: With Index

If either of the table has indexed then it goes hash join. I haven’t shown this picture here. You can drop either of the table indexes and test it.

Test2: Without Index

Let’s create a medium table

create table tableC (id int identity,name varchar(50))
insert into tableC (name)
select name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableC --2506
create table tableD (id int identity,name varchar(50))
insert into tableD (name)
select name from master.dbo.spt_values
select * from dbo.tableC C join tableD D
on (C.id=D.id)
-- select COUNT(*) from dbo.tableD --2506

Test2: With Index

Let’s create a clustered index

create unique clustered index cx_tableC on tableC (id)
create unique clustered index cx_tableD on tableD (id)

If either of the table has indexed then it goes merge join. I haven’t shown this picture here. You can drop either of the table indexes and test it.

Test3: Without Index

create table tableE (id int identity,name varchar(50))
insert into tableE (name)
select top 10 name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableE --10
create table tableF (id int identity,name varchar(50))
insert into tableF (name)
select top 10 name from master.dbo.spt_values
-- select COUNT(*) from dbo.tableF --10

Let’s create a clustered index

create unique clustered index cx_tableE on tableE (id)
create unique clustered index cx_tableF on tableF (id)

Test3: With Index

If either of the table has indexed then it goes Nested loop join. I haven’t shown this picture here. You can drop either of the table indexes and test it.

You can also join tables vice versa like big table Vs Medium table Vs small table

select * from dbo.tableA A join tableC C
on (a.id=C.id)

select * from dbo.tableA A join tableE E
on (a.id=E.id)

select * from dbo.tableC C join tableE E
on (C.id=E.id)

In this case if all the table has indexed then it goes Nested loop join. If they don’t then hash join. If either of the table has indexed then it goes Nested loop join. I haven’t shown this picture here.

Still you can force optimizer to use any one of the internal joins, but it’s not good practice. The query optimizer is smart it will dynamically choose the best one.

Here just I used the merge hint so the optimizer goes to merge join instead of a hash join (Test1 without an index)

select * from dbo.tableA A join tableB B
on (A.id=B.id)option (merge join)

select * from dbo.tableA A inner merge join tableB B
on (A.id=B.id)

Table 1: Test uses a unique clustered index

From the table diagram:

If both the tables have NO index then the query optimizer will choose “Hash joins” internally.
If both the tables have indexes then the query optimizer will choose “Merge (For big tables) /Nested loop (For small tables)” internally.
If either of the tables have indexes then the query optimizer will choose “Merge (For medium tables) /Hash (For big tables) /Nested loop (For small & big Vs small tables)” internally.

Table 1: Test using clustered index

(create clustered index cx_tableA on tableA (id))

Table size	With index (Both)	Without Index(Both)	Either of table has index
Big (Both)	HASH	HASH	HASH
Medium (Both)	HASH	HASH	HASH
Small (Both)	NESTED LOOP	NESTED LOOP	HASH
Big Vs Small(medium)	HASH	HASH	HASH

From the table diagram:

This test has done using without a unique clustered index. See if the index created without unique keyword then there is no guarantee SQL will not know its unique (data) so it will create 4-byte integer GUID unique-identifier by default.

Look the diagram there is no MERGE join if the clustered index created without unique.

Thanks @Dave for your email 🙂 . Now the second diagram added.

Conclusion:

Merge Join

Merge join is possible for the tables have an index on the join column. The index either clustered or covering non-clustered index. It’s best join for this circumstance. Why it’s best join? Because it needs an index for both the tables. So it’s already sorted and easily match and return the data.

Hash Join

Hash joins is possible for tables with no index (or) either of the big tables has indexed. It’s best join for this circumstance. Why it’s best join? Because it’s worked great for big tables with no index and run the query parallel (more than one processor) and give the best performance. Most of folk says its heavy lifter join.

Nested loop Join

Nested loop join is possible for small tables with index (or) either of the big tables have indexed. It’s best join for this circumstance. Why it’s best join? Because it works great for small tables like, compares each row from one table to each row from the other table ‘looping’.

For more: Read Gail Shaw’s blog. It has Craig Freedman’s link too. Why I gave Gail’s link instead of directly give Craig‘s link. It’s worth reading and most of my readers are from India and they have to read all her posts.

I hope now you can understand how the query optimizer will choose the most optimal join types.

36 Comments

Mandy Fox

October 5, 2011 at 6:12 am Reply

Hi, hope you don’t mind me using your comment form to get in touch? It seemed the quickest and most pertinent method. As a fellow blogger, I couldn’t help noticing your site isn’t optimised for mobile users. Did you know the mobile market is 100 times bigger than the whole of the internet? There’s a video here that explains it all pretty well: http://ow.ly/6M4OF
If you have a few minutes, do have a look, I think you might be surprised at what you’re missing.
siva

October 5, 2011 at 6:37 pm Reply

Thanks for the post.
- Muthukkumaran kaliyamoorthy
  
  October 5, 2011 at 8:48 pm
  
  I’m glad you liked it.
Fausnaught

October 7, 2011 at 9:18 pm Reply

It is hard to find knowledgeable people on this specific topic, but you sound like you know very well what you’re writing about! Cheers
- Muthukkumaran kaliyamoorthy
  
  October 8, 2011 at 10:39 pm
  
  Thanks.
Abril

October 7, 2011 at 9:37 pm Reply

awesome blog, do you have twitter or facebook? i will bookmark this page thanks.
- Muthukkumaran kaliyamoorthy
  
  October 8, 2011 at 10:41 pm
  
  Thanks abril. My twitter and facebook accounts are there on the top page(follow me).
- Muthukkumaran kaliyamoorthy
  
  October 8, 2011 at 11:32 pm
  
  Where do you get/find my blog post links? Are you advertiser?
  
  Please don’t post spam comment.
Dave

October 10, 2011 at 3:55 pm Reply

Hello muthukkumaran

Thanks for the cool post. I run your scripts but its seems missed some lines in your first test script. Fix it.
- Muthukkumaran kaliyamoorthy
  
  October 10, 2011 at 9:20 pm
  
  Thanks Dave. Now it’s updated.
  
  Thanks for script testing & reading 🙂
Dave

October 10, 2011 at 4:24 pm Reply

Test1: With Index

I wrote a while loop as same like you but for me it shows hash join.

I want to confirm. can you fix your script!
- Muthukkumaran kaliyamoorthy
  
  October 10, 2011 at 9:23 pm
  
  Dave
  
  Can you tell me your sql version. Also can you post me your tested scripted.In my test it goes merge join.
  
  Thanks once again!
Skillington

October 12, 2011 at 8:12 am Reply

Excellent! it is the word that comes to my mind when I look at your blog and when I read what you have to say. Your opinions are very much alike mine. I happy that I decided to enter your website and find out what you are interested in.
Dave

October 12, 2011 at 7:45 pm Reply

Hi Thanks! Mine 2008 r2.

I have tested the same code but I created clustered index without unique and row counts are vary (250800).

Yep you are right if I created the clustered index as unique it shows merge.
- Muthukkumaran kaliyamoorthy
  
  October 12, 2011 at 11:56 pm
  
  @dave Yeah its expected one. SQL knows its unique and its easily match it.
  
  As i already said in my post it depends upon the following.
  
  • It depends upon the table size
  • It depends upon the index on the join column
  • It depends upon the Sorted order on the join column
- muthukkumaran
  
  October 16, 2011 at 11:18 pm
  
  Roundup post : http://codegumbo.com/index.php/2011/10/10/tsql2sday-roundup/
Ignatius Gannon

October 22, 2011 at 8:46 am Reply

Unusual icons on my desktop?
- Muthukkumaran kaliyamoorthy
  
  October 22, 2011 at 12:57 pm
  
  @Ignatius Gannon
  
  Couldn’t get it. You mean the image is not shown clear.
gam

October 23, 2011 at 4:00 am Reply

I really liked your blog! It helped me alot… Awesome. Exactly what I was looking for. Thanks!
- Muthukkumaran kaliyamoorthy
  
  October 23, 2011 at 11:04 am
  
  Thanks.I’m glad it helps you.
Cliff Konicki

October 26, 2011 at 7:58 pm Reply

whoah this blog is fantastic i love reading your articles. Keep up the good work! You know, a lot of people are looking around for this info, you can aid them greatly.
- Muthukkumaran kaliyamoorthy
  
  October 29, 2011 at 5:43 pm
  
  Thanks Cliff Konicki
- Gurupreet
  
  May 30, 2013 at 3:48 pm
  
  Nice Explanation
Sandeep

June 4, 2013 at 10:13 am Reply

Thanks Really Very good Explanation..
- Pavan Bhardwaj
  
  October 10, 2013 at 5:52 pm
  
  Really, awesome post
  Keep it up
  I like your work
  I am also an blogger on: Code Imagine
Andy Russell

July 10, 2014 at 5:57 pm Reply

Very useful article, well written and informative – thanks !
Pratap

August 5, 2014 at 1:41 am Reply

Hey very nice info…just learned abt them today and when tried more info for knowledge, was routed to ur blog..I really liked the comparison idea..good article..thanks for sharing the knowledge.
Britto

April 1, 2015 at 12:56 pm Reply

Awesome blog…
- suhas
  
  February 21, 2016 at 10:29 pm
  
  Awesome Blog. Explanation Physical joins made easy.
  Thanks you!
Mohd Tahir Siddiqui

November 30, 2016 at 4:24 pm Reply

Hi, Muthukkumaran kaliyamoorthy
I am much thankful of this post. You solve my confusions. God bless you.
- Muthukkumaran kaliyamoorthy
  
  December 21, 2016 at 10:29 am
  
  Hi Mohd,
  I am glad it helped you.
preeti chourasiya

February 8, 2017 at 12:35 pm Reply

awesome
Suresh Sanga

February 9, 2017 at 1:39 pm Reply

Good one with explanation using example,help to understand clearly,Thank U 🙂
- Muthukkumaran kaliyamoorthy
  
  February 16, 2017 at 2:13 pm
  
  Thanks! Glad that it helped.
Coim

September 20, 2019 at 1:31 pm Reply

Good article .thanks for posting pls update new articles like this
kangraevents

December 16, 2019 at 12:53 pm Reply

Thank You for an amazing article, it was really helpful for me and loved to read this.

36 Comments

Mandy Fox

siva

Fausnaught

Dave

Dave

Skillington

Dave

Ignatius Gannon

gam

Gurupreet

Sandeep

Andy Russell

Pratap

Britto

suhas

Mohd Tahir Siddiqui

preeti chourasiya

Suresh Sanga

Leave a Reply Cancel reply