Thursday, March 29, 2012

comparing two records field by field

Hi,
a stored proc should find all different fields in two records. This two
records are in two temp tables and have the same fields, but the
recordstructure can be different in every call of the stored proc.
For example, in one call i check the difference of two records from the
customer table, in the next call i check the difference of two records from
the orders table.
So i cannot use field names. In the moment i read all informations from
sysobjects and syscolumns and build sql strings for each column and execute
them to get the value as varchar and compare them.
But this is so performance- and timeconsuming, especially for records with
nearly 200 fields, that i have to find another way.
Any suggestions how to do this?
thanks for all tips,
HelmutIt depends on what your ultimate aim is (other than pain). If you're
just wanting some data analysis there's a tool called SQL data compare
that does that quite well.
If you're aiming for some nice dynamic query, you'll have to post a SQL
example (I'm afraid your description was a tad cryptic).
my approach when comparing things dynamically is to use the stored proc
sp_columns. Shove that up a temporary table, use sp_executesql to then
execute the compare, however it sounds like you're nearly doing that
(shouldn't be too bad performance wise - are you hitting the cursors a
bit too hard?)
I think you'll get better advice if you put some hard examples up of
what you're currently doing, and what you're wanting it to do.
As a load of people write as a generic response - "post your DDL and
code examples and maybe we can help more" - they say "more" as though
that was in some way helpful
anyway,
more info dude
Cheers
Will|||Hello Will,
thanks for your response. Here an example:
let's say, i have two backups of a database. Now i restore both backups and
have a stored proc wich loops through the persons table record by record.
There is an identity field, so i can read the same record from both backups
...
insert into #tbl1 select * from backup1.dbo.persons where id = 123
insert into #tbl2 select * from backup2.dbo.persons where id = 123
...
if both records exists, then i want to find the differences field by field,
for this i call another stored proc, and this proc knows, that it has one
record in #tbl1 and another record in #tbl2 with identical structure.
But it does not know, that this are records from table persons!
So it should do something like:
declare @.tblDiff (
fieldname varchar(20),
oldValue varchar(2000),
newValue varchar(2000) )
for x = 1 to #tbl1.fieldcount do begin
if (#tbl1.field[x].IsNull and not #tbl2.field[x].IsNull) or
(not #tbl1.field[x].IsNull and #tbl2.field[x].IsNull) or
(#tbl1.field[x].Value <> #tbl2.field[x].Value) then
insert into @.tblDiff values(#tbl1.field[x].Name, #tbl1.field[x].Value,
#tbl2.field[x].Value)
end
return select * from @.tblDiff
This is a pseudo code to show what i want to do. Now i need to solve this
in pure TSQL.
thanks,
Helmut|||You could try something like this, but it sounds like you're already at
this solution.
DECLARE @.Table varchar(50)
SET @.Table = 'sysobjects'
DECLARE @.IDvalue int
CREATE TABLE #Cols
(
TABLE_QUALIFIER sysname,
TABLE_OWNER sysname,
TABLE_NAME sysname,
COLUMN_NAME sysname,
DATA_TYPE smallint,
TYPE_NAME sysname,
[PRECISION] int,
LENGTH int,
SCALE smallint,
RADIX smallint,
NULLABLE smallint,
REMARKS varchar(254),
COLUMN_DEF nvarchar(4000),
SQL_DATA_TYPE smallint,
SQL_DATETIME_SUB smallint,
CHAR_OCTET_LENGTH int,
ORDINAL_POSITION int,
IS_NULLABLE varchar(254),
SS_DATA_TYPE tinyint
)
CREATE TABLE #Results(field sysname, OldValue nvarchar(4000), NewValue
nvarchar(4000))
INSERT INTO #Cols
exec sp_Columns @.Table
DECLARE Cols Cursor
FOR SELECT Colname
FROM #Cols
DECLARE @.Colname sysname
OPEN Cols
FETCH NEXT FROM cols INTO @.COlname
WHILE @.@.FETCH_STATUS = 0
BEGIN
DECLARE @.SQL nvarchar(4000)
SET @.SQL = 'INSERT INTO #Results SELECT ''' + @.Colname + ''', T1.' +
@.Colname + ', T2.' + @.ColName + ' FROM database1.dbo.' + @.tablename + '
as T1 INNER JOIN database2.dbo.' + @.TableName + ' as T2 on T1.IDField =
' + @.IDValue + ' AND T2.IDField = ' + @.IDValue + 'WHERE t1.' + @.Colname
+ ' != T2.' + @.Colname + ' AND NOT (T1.' + @.ColName + ' IS NULL AND
T2.' + @.ColName + ' IS NULL)'
exec sp_executesql @.SQL
FETCH NEXT FROM cols INTO @.COlname
END
CLOSE Cols
DEALLOCATE Cols
DROP TABLE #Cols
SELECT * FROM #REsults
DROP TABLE #Results|||Another soloution would be to use
BINARY_CHECKSUM()
HTH, Jens Suessmeyer.
http://www.sqlserver2005.de
--|||Am 6 Apr 2006 01:47:28 -0700 schrieb Jens:

> Another soloution would be to use
> BINARY_CHECKSUM()
>
> HTH, Jens Suessmeyer.
> --
> http://www.sqlserver2005.de
> --
Not really, at first, BOL says, that binary_checksum() can detect most, but
not all changes. And second, this can be used maybe in the calling function
to detect, if a difference is here. But my problem is the second step -
finding and documenting the changes.
bye, Helmut|||Hi Will,
thanks for this source. What i see at first, i need much more steps to find
the differences, so maybe you solution is faster (i hope).
But the main points are the same, information comes from sysobjects and for
every field i have to build a statement and do a sp_executesql.
But on the wend i take this source and make a benchmark and post the
result on monday.
Do you think, it would be much faster using an external stored proc?
thank you very much,
Helmut|||What are you trying to do Helmut? some kind of a retrospective audit
trail? have you considered putting triggers on your tables to maintain
this data as it changes?|||Am 6 Apr 2006 03:07:00 -0700 schrieb Will:

> What are you trying to do Helmut? some kind of a retrospective audit
> trail? have you considered putting triggers on your tables to maintain
> this data as it changes?
Yes, it should do both. For audit trail i use the data from inserted and
deleted. Working with columns_updated() is no solution, because i only
store changes and this means, if i have a record with 200 fields and only 5
fields are filled with data (and the rest is NULL), then i store only this
5 values. But when you do an insert, then columns_updated() has all bits
set.
bye, Helmut|||Helmut,
I'm afraid I think that your query is just about as fast as you'll get
it (I certainly can't think of any significant ways to improve it).
I've had one idea you could try, but I don't have time to fully
investigate it:
You could rather than performing the difference check on each column,
build up the SQL string as 'select inserted.'+colname+',
deleted.'+colname,+' CAST(CASE when insert.'+colname+' !=
deleted.'+colname+' then 1 else 0 end as bit) as ' + colname +
'ischanged' .
then build up this string for the whole set of columns, only do one
select into a temporary table, then do your inserts into your audit
table based on whether or not the bit field conameischanged has been
set to 1.
I have no idea if that will work, it has the advantage that you only
select from your 2 tables once as opposed to once per column, but it
has the divantage of needing an extra insert into a temporary table.
if you do write it then post the source and the benchmarks, I'd be
interested to know how it performs. If this isn't clear let me know,
I'll try and write some SQL this evening.
Cheers
Will

No comments:

Post a Comment