Showing posts with label build. Show all posts
Showing posts with label build. Show all posts

Monday, March 19, 2012

Compare strings in text field

I am trying to build a simple search engine using Sql Server 2000 to scan information about approximatelly 20.000 products.

Heres what I am doing:

I created a table called keywords that contains a reference for each product.

keyword -> varchar(100)
items -> Text

keyword data example:

[keyword] [items]
car 1, 3, 5, 7
blue 3, 5
compact 1,7

I am not using clustered index.

To search basically I run the "AND" or "OR" to select the keywords I want to target.
I need to run another select that would compare the data in the items field depending of the condition selected. If "AND" clause is used I would need to compare all the items that contains the same reference, for example:

looking for car compact using "AND" clause
result = 1

looking for car compact using "OR" clause
result = 1,3,5,7

There is no table that holds references. The items are stored in a text field in the keyword table. I can compare data using script like AsP spliting the items by comma or space, but that can be too slow and use up a lot of RAM. Another solution would be to use a table to hold the references but that would affect performance dramatically because of the large number of records created and storage space used. One example, if I have 60.000 keywords and each keyword has an avereage of 200 references, I would have to generate 12 milion records.

I want to know if there is a function or routine in SQL server to compare matched references on the fly in the server between two or more fields and how should I do it.

In addition, in this scenario, how should a clustered index help?

Thanks

RodrigoNot sure i get it but from what i understand why do you have a , delimited string of the items

why not a seperate row for each item so that, you will get all the items in a query that you can then use easier...i take it the item is the foreign key for some record in another table|||If I am right, that physically generating the rows would fall in the scenario in the 12 milion scenario leading to unecessary records and lots of sql file pages to hold the data references. I 'd ratter have a little IO because I could cache the results. Do you have any suggestion?

Thanks
Rod

Sunday, February 12, 2012

Command delivery status in MSrepl_commands

I need to build some custom replication trouble shooter and I can't find
answers on couple of questions.
- How to determine status of command in MSrepl_commands? I mean status is
delivered this command to subscriber or not.
- How long delivered transactions reside in MSrepl_commands?
Have a look at the view msdistribution_status, which should be what you are
looking for. The code of the view is not encrypted and you could use it to
not do the group by and create your own version. If you want the actual
commands, then have a look at sp_browsereplcmds.
How long do commands reside in this table? If they have been read by all the
distribution agents involved and you don't have anonymous subscribers, the
commands will be removed by the cleanup agent. If a distribution agent isn't
synchronized or you have anonymous subscribers, they'll stay there until the
retention period is reached (72 hourd by default), and are then removed by
the cleanup agent.
Rgds,
Paul Ibison SQL Server MVP, www.replicationanswers.com
(recommended sql server 2000 replication book:
http://www.nwsu.com/0974973602p.html)