HomeЛюди и блогиRelated VideosMore From: Bert Wagner

One SQL Cheat Code For Amazingly Fast JSON Queries - SQL JSON Index performance

23 ratings | 2357 views
How non-persisted computed column indexes make your JSON queries high performance. Did you know that non-persisted columns actually get saved to the index pages? This makes JSON querying incredibly fast in SQL Server. Please like and subscribe if you enjoyed this video! Blog post: https://bertwagner.com/2017/05/09/one-sql-cheat-code-for-amazingly-fast-json-queries/ Want to receive the latest weekly blog posts and videos in your inbox? Sign up for the newsletter here: https://upscri.be/c77fc8/ Elsewhere on the internet: https://bertwagner.com https://twitter.com/bertwagner
Html code for embedding videos on your blog
Text Comments (8)
chobo2 (1 month ago)
Hey How would you handle with JSON if you wanted to normalize your data. Say for instance you have a pre-defined list of options lets take your car example, So you have Fuel: which could have Diesel or Gas). Now I probably would end up making some sort of table to hold these 2 choices. Now maybe I sell an array of different products that may not have fuel spect(maybe it is hand powered or it's a brick) so I would stick this in my json column. What happens if I want to store the id instead of the label name (ie Fuel: 1 vs Fuel : Gas") How would I later on doing a join so I can get the proper label name later on?
Bert Wagner (1 month ago)
Hi chobo2, thanks for watching! My understanding is that you want to transform this OLTP data into more of a normalized data warehouse format. I do not think this is possible to do in a single pass - if I needed to do this I would build multiple queries to normalize the data eg. first I would find my distinct fuel types and save them off into a table. I would then join those values back with the original parsed json to get the IDs, and insert those IDs into another table, etc... In essence, you are using the JSON functions to parse your data in pieces and parts as you need it while following a traditional data warehousing ETL process. Hope that makes sense!
Nam Ngo (8 months ago)
Hey Bert, nice meeting you at SQL Saturday in Rochester recently. I'm trying to apply this technique to a new proof of concept project I'm working on, and I had a question (not sure if you'll know the answer or have a solution, but I figured I'd ask anyway). Long story short, I have JSON data with specific fields and I want to create this non-persisted computed column to index, but I don't want the index value to appear for all rows. Example: I have multiple rows in which they have an "Id" field in the JSON, but the rows differ with different structured JSON (let's say structure A and structure B). I want to index JSON structure A on the Id, but only if it matches this structure (and exclude, or make the value in this index null for non-structure A JSON even though structure B JSON has an Id field). Can you think of any way to accomplish this? Or am I going about this in a way too difficult mechanism?
Nam Ngo (8 months ago)
Thank you kindly. Your answer is as I feared. Thank you for the knowledge!
Bert Wagner (8 months ago)
Also, while you can't create a filtered index on the computed column, if you have another column that identifies the difference in data between rows, you can create a filtered index on that column and include your computed JSON column. --Filtered Index CREATE NONCLUSTERED INDEX FX_TypeAId ON dbo.IndexTest ( RecordType, TypeAId ) WHERE RecordType = 'A'; Once again, not sure of the performance implications of something like this, so be sure to test
Bert Wagner (8 months ago)
Hi Nam, great question! You can create a case statement in the computed column so in theory your idea should work. Not sure of the performance implications (I didn't have time to test it out), but it is possible to do. Thanks for the question! DROP TABLE IF EXISTS dbo.IndexTest CREATE TABLE dbo.IndexTest ( RecordType char(1), JsonData nvarchar(100) ); GO INSERT INTO dbo.IndexTest VALUES ('A',N'{"Id": 1}'); INSERT INTO dbo.IndexTest VALUES ('B',N'{"Id": 2}'); INSERT INTO dbo.IndexTest VALUES ('A',N'{"Id": 3}'); INSERT INTO dbo.IndexTest VALUES ('B',N'{"Id": 4}'); GO ALTER TABLE dbo.IndexTest ADD TypeAId AS CASE WHEN RecordType = 'A' THEN JSON_VALUE(JsonData, '$.Id') ELSE NULL END; SELECT * FROM dbo.IndexTest -- Regular Index CREATE NONCLUSTERED INDEX IX_TypeAId ON dbo.IndexTest ( TypeAId ); GO
Chad Hynes (1 year ago)
What's the difference between a persisted computed column, and an indexed non-persisted computed column then? Isn't an index being stored "on disk"?
Bert Wagner (1 year ago)
Great question Chad. While both persisted computed columns and indexed non-persisted computed columns are stored on disk, the persisted computed column gets stored on the data page, while the indexed non-persisted computed column gets stored on an index page. Depending on how you are querying your data, the data on the index page might get returned faster via an index seek versus that same value stored in a data page retrieved via a table scan. This is just a rule of thumb though, it might not always be the case.

Would you like to comment?

Join YouTube for a free account, or sign in if you are already a member.