﻿{"id":158,"date":"2014-10-11T10:52:38","date_gmt":"2014-10-11T10:52:38","guid":{"rendered":"http:\/\/www.codereview.co\/?p=158"},"modified":"2014-10-11T21:03:17","modified_gmt":"2014-10-11T21:03:17","slug":"storing-an-ip-address-in-a-sql-server-database","status":"publish","type":"post","link":"http:\/\/www.codereview.co\/index.php\/sql-server-tutorials\/storing-an-ip-address-in-a-sql-server-database\/","title":{"rendered":"Storing an IP address in a SQL Server database"},"content":{"rendered":"<p><span style=\"color: #000000;\"><strong>Scenario proposal<\/strong><\/span><\/p>\n<p><span style=\"color: #000000;\">Often databases are used with web-based interfaces and recording the IP address of the end user can assist with debugging, marketing, bandwidth planning and collation selection to name a few. In a scenario where each page access is logged, is there an optimal way to store IP addresses?<\/span><\/p>\n<p><span style=\"color: #000000;\"><strong>Solution<\/strong><\/span><\/p>\n<p><span style=\"color: #000000;\">This tutorial will work through two obvious options and an obscure one on different ways to store IP addresses in your database.<\/span><\/p>\n<p><span style=\"color: #000000;\">The first option is using a <strong>varchar(15)<\/strong>:<\/span><\/p>\n<ul>\n<li><span style=\"color: #000000;\">From CGI variables, the four octets are separated with periods (.) which form a string.<\/span><\/li>\n<li><span style=\"color: #000000;\">Writing this string as a varchar to the database is straight-forward.<\/span><\/li>\n<li><span style=\"color: #000000;\">The simplicity has some tradeoffs.<\/span>\n<ul>\n<li><span style=\"color: #000000;\">Some space is under-utilized whenever an octet is less than 100, and the mask always uses three bytes. (101.100.99.9 uses 12 characters.)<\/span><\/li>\n<li><span style=\"color: #000000;\">Octets need to parsed using either SUBSTRING or CHARINDEX to sort and search by geographic location (see here).<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"color: #000000;\">The second option is using <strong>four tinyints<\/strong>:<\/span><\/p>\n<ul>\n<li><span style=\"color: #000000;\">Octets range between 0 and 255 making them perfect matches for tinyints and therefore storage space is perfectly utilized.<\/span><\/li>\n<li><span style=\"color: #000000;\">Some parsing will need to be done when writing the data, but the benefits are many.<\/span>\n<ul>\n<li><span style=\"color: #000000;\">Because they are numbers, sorting will work as expected.<\/span><\/li>\n<li><span style=\"color: #000000;\">A clustered index can be created on all four fields to improve performance.<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"color: #000000;\">A view can be created to display the original string&#8230;<\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CREATE VIEW V_IP_ADDRESS AS<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>SELECT IP_1, IP_2, IP_3, IP_4,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_1) AS VARCHAR + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_2) AS VARCHAR + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_3) AS VARCHAR + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_4) AS VARCHAR<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>AS IP_ALL<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>FROM IP_ADDRESS<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\">&#8230;as can a computed column.<\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CREATE TABLE IP_ADDRESS (<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_1 TINYINT NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_2 TINYINT NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_3 TINYINT NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_4 TINYINT NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_ALL AS<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_1 AS VARCHAR(3)) + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_2 AS VARCHAR(3)) + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_3 AS VARCHAR(3)) + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(IP_4 AS VARCHAR(3))<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>PERSISTED,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CONSTRAINT PK_IP PRIMARY KEY (IP_1, IP_2, IP_3, IP_4)<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>)<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\">Note that this example creates a clustered index on the four IP octets immediately. Another option is an autonumber Primary Key (PK) to start, then dropping that and creating a clustered index on the IP segments later.<\/span><\/p>\n<p><span style=\"color: #000000;\">The third option is using <strong>four binary(1)s<\/strong>:<\/span><\/p>\n<ul>\n<li><span style=\"color: #000000;\">For completeness, four binary(1)s use the same amount of disk space as four tinyints, but their use is more complex.<\/span><\/li>\n<li><span style=\"color: #000000;\">Writing will require parsing and casting.<\/span>\n<ul>\n<li><span style=\"color: #000000;\">To write, each octet needs to be parsed and casted [to 0x__] using this function master.dbo.fn_varbintohexstr(): master.dbo.fn_varbintohexstr(cast (IP_1 as varbinary)).<\/span><\/li>\n<li><span style=\"color: #000000;\">A computed column or view will need to cast to tinyint then varchar to return the original address.<\/span><\/li>\n<\/ul>\n<\/li>\n<li><span style=\"color: #000000;\">The main benefit of this approach is to obfuscate anyone viewing the database directly, and chances are if they are viewing your data they can probably also read hexadecimal.<\/span><\/li>\n<\/ul>\n<p><span style=\"color: #000000;\"><em>CREATE TABLE IP_ADDRESS_BIN (<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_1 binary(1) NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_2 binary(1) NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_3 binary(1) NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_4 binary(1) NOT NULL,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>IP_ALL AS<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(cast(IP_1 as tinyint) AS VARCHAR(3)) + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(cast(IP_2 as tinyint) AS VARCHAR(3)) + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(cast(IP_3 as tinyint) AS VARCHAR(3)) + &#8216;.&#8217; +<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CAST(cast(IP_4 as tinyint) AS VARCHAR(3))<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>PERSISTED,<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>CONSTRAINT PK_IP_BIN PRIMARY KEY (IP_1, IP_2, IP_3, IP_4)<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\"><em>)<\/em><\/span><\/p>\n<p><span style=\"color: #000000;\">Here is a comparison of the three options outlined above:<\/span><\/p>\n<table>\n<thead>\n<tr>\n<td colspan=\"7\"><span style=\"color: #000000;\"><strong>Comparison Table<\/strong><\/span><\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><span style=\"color: #000000;\"><strong>Datatype<\/strong><\/span><\/td>\n<td><span style=\"color: #000000;\"><strong>Storage<\/strong><\/span><\/td>\n<td><span style=\"color: #000000;\"><strong>Writing<\/strong><\/span><\/td>\n<td><span style=\"color: #000000;\"><strong>Displaying<\/strong><\/span><\/td>\n<td><span style=\"color: #000000;\"><strong>Sorting<\/strong><\/span><\/td>\n<td><span style=\"color: #000000;\"><strong>Drawback(s)<\/strong><\/span><\/td>\n<td><span style=\"color: #000000;\"><strong>Benefit(s)<\/strong><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"color: #000000;\">varchar(15)<\/span><\/td>\n<td><span style=\"color: #000000;\">15 b<\/span><\/td>\n<td><span style=\"color: #000000;\">Verbatim<\/span><\/td>\n<td><span style=\"color: #000000;\">Verbatim<\/span><\/td>\n<td><span style=\"color: #000000;\">Parse needed<\/span><\/td>\n<td><span style=\"color: #000000;\">Highest disk space needed<\/span><\/td>\n<td><span style=\"color: #000000;\">Readability, ease to insert<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"color: #000000;\">four tinyints<\/span><\/td>\n<td><span style=\"color: #000000;\">4 b<\/span><\/td>\n<td><span style=\"color: #000000;\">Parse and one CAST<\/span><\/td>\n<td><span style=\"color: #000000;\">View or computed column<\/span><\/td>\n<td><span style=\"color: #000000;\">Native<\/span><\/td>\n<td><span style=\"color: #000000;\">Sew together with cast to recreate string<\/span><\/td>\n<td><span style=\"color: #000000;\">Ideal storage size, sorting, performance<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"color: #000000;\">four binary(1)s<\/span><\/td>\n<td><span style=\"color: #000000;\">4 b<\/span><\/td>\n<td><span style=\"color: #000000;\">Parse and two CASTs<\/span><\/td>\n<td><span style=\"color: #000000;\">View or computed column<\/span><\/td>\n<td><span style=\"color: #000000;\">Requires two casts to sort and read, code is harder to read\/maintain<\/span><\/td>\n<td><span style=\"color: #000000;\">Ideal storage size, obscure values<\/span><\/td>\n<td><span style=\"color: #000000;\">Obfuscate the data<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"color: #000000;\"><strong>\u00a0<\/strong><\/span><\/p>\n<p><span style=\"color: #000000;\"><strong>Thanks for reading this article,<\/strong><\/span><\/p>\n<p><span style=\"color: #000000;\"><strong>Next steps :<\/strong><\/span><\/p>\n<ol>\n<li><span style=\"color: #000000;\"><b><strong>Add this script to your database toolkit<\/strong><\/b><\/span><\/li>\n<li><span style=\"color: #000000;\"><b><strong>Share this with your colleagues because Sharing is Learning<\/strong><\/b><\/span><\/li>\n<li><span style=\"color: #000000;\"><strong>Comment below if you need any assistance<\/strong><\/span><\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Scenario proposal Often databases are used with web-based interfaces and recording the IP address of the end user can assist with debugging, marketing, bandwidth planning and collation selection to name a few. In a scenario where each page access is logged, is there an optimal way to store IP addresses? Solution This tutorial will work [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[32,3],"tags":[53,8,52,13,12,51],"_links":{"self":[{"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/posts\/158"}],"collection":[{"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/comments?post=158"}],"version-history":[{"count":4,"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/posts\/158\/revisions"}],"predecessor-version":[{"id":248,"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/posts\/158\/revisions\/248"}],"wp:attachment":[{"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/media?parent=158"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/categories?post=158"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.codereview.co\/index.php\/wp-json\/wp\/v2\/tags?post=158"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}