Comma Delimited Fields: Comparing data in two consecutive rows from a single table

I'm trying to come up with an elegant, simple way to compare two
consecutive values from the same table.

For instance:

SELECT TOP 2 datavalues FROM myTable ORDER BY timestamp DESC

That gives me the two latest values. I want to test the rate of
change of these values. If the top row is a 50% increase over the row
below it, I'll execute some special logic.

What are my options? The only ways I can think of doing this are
pretty ugly. Any help is very much appreciated. Thanks!

B.|||>> SELECT TOP 2 datavalues FROM myTable ORDER BY timestamp DESC;

That gives me the two latest values. I want to test the rate of
change of these values. If the top row is a 50% increase over the row
below it, I'll execute some special logic. <<

TIMESTAMP is a reserved word in Standard SQL, which matches T-SQL
DATETIME. Rows have no physical ordering in a table, so that part
makes no sense. Or it means that you still have a mental model of a
sequential file -- probably a clipboard with a single column for data
points and a single column for a timeclock imprint.

That is not how to think about it in an RDBMS. If you have an event,
then you need to show a duration. This is Einstein's physics and
Zeno's paradoxes. Furthermore, each row must represent a complete
fact in itself, not half a fact. Let's try again:

CREATE TABLE Foobar
(start_datavalue DECIMAL(8,4) NOT NULL,
end_datavalue DECIMAL(8,4) NOT NULL,
start_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
NOT NULL PRIMARY KEY,
end_time TIMESTAMP, -- null means still open
..
);

As each sample is taken, do an insert and an update (pardon my
Standard SQL-92):

BEGIN ATOMIC

INSERT INTO Foobar
VALUES (:datavalue, (SELECT start_time FROM Foobar WHERE end_time IS
NULL),
CURRENT_TIMESTAMP, NULL, ..);

UPDATE Foobar
SET end_time = CURRENT_TIMESTAMP
WHERE end_time IS NULL;

END;
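
One way to read the insert-and-update step, sketched as a small Python script against an in-memory SQLite database so it can be run anywhere. The column list follows the CREATE TABLE above. Two things are assumptions of this sketch and are not spelled out in the post: the open row is closed before the new one is inserted, and the new row's end_datavalue is seeded with the same sample because the column is NOT NULL. SQLite stands in for an SQL-92 engine, the function name record_sample is made up, and timestamps are passed in as parameters instead of CURRENT_TIMESTAMP so the example is repeatable.

```python
import sqlite3

def make_db():
    # In-memory SQLite database with the interval table from the post.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Foobar
        (start_datavalue DECIMAL(8,4) NOT NULL,
         end_datavalue DECIMAL(8,4) NOT NULL,
         start_time TIMESTAMP NOT NULL PRIMARY KEY,
         end_time TIMESTAMP)  -- NULL means still open
        """
    )
    return conn

def record_sample(conn, datavalue, sample_time):
    # Assumption: close the open interval first, then open a new one.
    # Doing the INSERT first would let the UPDATE close the new row as well.
    with conn:  # one transaction, standing in for BEGIN ATOMIC ... END
        conn.execute(
            "UPDATE Foobar SET end_datavalue = ?, end_time = ? "
            "WHERE end_time IS NULL",
            (datavalue, sample_time),
        )
        conn.execute(
            "INSERT INTO Foobar "
            "(start_datavalue, end_datavalue, start_time, end_time) "
            "VALUES (?, ?, ?, NULL)",
            (datavalue, datavalue, sample_time),
        )

conn = make_db()
record_sample(conn, 10.0, "2004-06-02 10:00:00")
record_sample(conn, 16.0, "2004-06-02 10:05:00")
rows = conn.execute(
    "SELECT start_datavalue, end_datavalue, start_time, end_time "
    "FROM Foobar ORDER BY start_time"
).fetchall()
```

After the two samples, rows holds one closed interval (10 to 16) and one open interval that starts at 16 and has a NULL end_time.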

I think the code you want is a bit more complex. A change over time
needs to consider the time involved. It is one thing for a car to go
from 0 to 60 mph in 10 seconds and quite another for it to take 10
hours. But without considering the rate of change:

SELECT start_datavalue, end_datavalue, start_time, end_time
FROM Foobar
WHERE start_datavalue / end_datavalue <= 0.5000
OR start_datavalue / end_datavalue >= 2.0000;

This shows the rows where things doubled or halved in their timeslots.
You can add (end_time - start_time) to get the duration, do other
ratios, etc.|||--CELKO-- (jcelko212@.earthlink.net) writes:
> TIMESTAMP is a reserved word in Standard SQL, which matches T-SQL
> DATETIME. Rows have no physical ordering in a table, so that part
> makes no sense. Or it means that you still have a mental model of a
> sequential file -- probably a clipboard with a single column for data
> points and a single column for a timeclock imprint.

So why not? I bet the underlying business problem looks a whole lot
like that!

You know, most people who are using databases are trying to solve
real-world problems, not do exercises in relational algebra.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:18c7b3c2.0406021337.1ff11c89@.posting.google.c om...
> >> SELECT TOP 2 datavalues FROM myTable ORDER BY timestamp DESC;
> That gives me the two latest values. I want to test the rate of
> change of these values. If the top row is a 50% increase over the row
> below it, I'll execute some special logic. <<
> TIMESTAMP is a reserved word in Standard SQL, which matches T-SQL
> DATETIME. Rows have no physical ordering in a table, so that part
> makes no sense. Or it means that you still have a mental model of a
> sequential file -- probably a clipboard with a single column for data
> points and a single column for a timeclock imprint.
> That is not how to think about it in an RDBMS. If you have an event,
> then you need to show a duration. This is Einstein's physics and
> Zeno's paradoxes. Furthermore, each row must represent a complete
> fact in itself, not half a fact. Let's try again:

I have to disagree here Joe.

For example, one of our tables records the time a banner ad is served.
There's no "begin and end" there's an instant.|||IF EXISTS(
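SELECT *
FROM (SELECT TOP 1
datavalues AS firstdatavalue,
(SELECT TOP 1
datavalues
FROM myTable AS myTable2
WHERE myTable2.timestamp < myTable.timestamp
ORDER BY myTable2.timestamp DESC
) AS nextdatavalue
FROM myTable
ORDER BY timestamp DESC
) AS pair
-- compare the two values: EXISTS alone is true whenever myTable has a row
WHERE firstdatavalue >= nextdatavalue * 1.5
)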
BEGIN
-- ** Do My special Logic
END

|||Here's one that only needs to do one table scan...

IF
(SELECT
MAX(Top2.datavalues) / MIN(Top2.datavalues)
FROM
(SELECT
TOP 2 datavalues
FROM
myTable
ORDER BY timestamp DESC) AS Top2
) >= 2
BEGIN
...blah blah blah...
END
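
A point worth checking about the MAX/MIN form: the ratio does not know which of the two rows is newer, and a threshold of 2 means a doubling, not the 50% increase asked about in the original question. The sketch below runs the same query shape on SQLite from Python (LIMIT 2 stands in for TOP 2). The table and column names come from the thread; the function name max_min_ratio and the sample values are invented for illustration.

```python
import sqlite3

def max_min_ratio(values_oldest_first):
    # Load the samples into a throwaway table shaped like myTable.
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE myTable (datavalues REAL, "timestamp" INTEGER)')
    conn.executemany(
        "INSERT INTO myTable VALUES (?, ?)",
        [(v, t) for t, v in enumerate(values_oldest_first)],
    )
    # SQLite spells TOP 2 as LIMIT 2; otherwise the query has the same shape.
    (ratio,) = conn.execute(
        """
        SELECT MAX(Top2.datavalues) / MIN(Top2.datavalues)
        FROM (SELECT datavalues
              FROM myTable
              ORDER BY "timestamp" DESC
              LIMIT 2) AS Top2
        """
    ).fetchone()
    return ratio

rising = max_min_ratio([5.0, 10.0, 20.0])   # latest value doubled
falling = max_min_ratio([5.0, 20.0, 10.0])  # latest value halved
```

Both calls give the same ratio, so this test cannot tell a rise from a fall. If only increases should trigger the logic, the comparison has to keep track of which row is newer. In T-SQL an integer column would also need a cast before dividing, since integer division truncates.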

"Bryan Guilliams" <bryanguilliams@.hotmail.com> wrote in message
news:3d0adf3a.0406020801.22ece337@.posting.google.c om...
> I'm trying to come up with an elegant, simple way to compare two
> consecutive values from the same table.
> For instance:
> SELECT TOP 2 datavalues FROM myTable ORDER BY timestamp DESC
> That gives me the two latest values. I want to test the rate of
> change of these values. If the top row is a 50% increase over the row
> below it, I'll execute some special logic.
> What are my options? The only ways I can think of doing this are
> pretty ugly. Any help is very much appreciated. Thanks!
> B.|||On Wed, 2 Jun 2004 22:40:31 -0400, Jonathan Amend wrote:

>Here's one that only needs to do one table scan...

Nice! :-)

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)|||Hugo Kornelis <hugo@.pe_NO_rFact.in_SPAM_fo> wrote in message news:<499sb09pcde2ltgf2f9n610go5l63l4glh@.4ax.com>...

> Hi Bryan,
> IF (SELECT TOP 1 datavalues FROM myTable ORDER BY timestamp DESC) >
> (SELECT TOP 1 datavalues FROM
> (SELECT TOP 2 datavalues, timestamp
> FROM myTable ORDER BY timestamp DESC) A
> ORDER BY timestamp ASC) * 1.5
> BEGIN
> do something
> END
>
> Best, Hugo

Hugo, this works perfectly and is exactly what I was looking for! Thanks a ton!

B.|||You're right, of course, but a few words in my defense...

jcelko212@.earthlink.net (--CELKO--) wrote in message news:<18c7b3c2.0406021337.1ff11c89@.posting.google.com>...

> TIMESTAMP is a reserved word in Standard SQL, which matches T-SQL
> DATETIME. Rows have no physical ordering in a table, so that part
> makes no sense. Or it means that you still have a mental model of a
> sequential file -- probably a clipboard with a single column for data
> points and a single column for a timeclock imprint.

I changed the table and column names from the real ones out of habit.
The actual datetime column is called 'time_stamp', so no problem
there. And though rows have no physical ordering in a table, they
absolutely have a virtual ordering when you use an ORDER BY clause,
which I have.

> I think the code you want is a bit more complex. A change over time
> needs to consider the time involved. It is one thing for a car to go
> from 0 to 60 mph in 10 seconds and quite another for it to take 10
> hours. But without considering the rate of change:

Luckily, the problem is not that complex. I'm pulling data from
another source that already takes care of all of that. I can be
completely confident that the data are exactly x minutes apart and
need only act if the most recent value is significantly greater than
the previous one.
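
A minimal sketch of that check, assuming the samples are evenly spaced as described, so that only the two newest values matter. It uses SQLite from Python (LIMIT 2 in place of TOP 2). The table name myTable and the columns datavalues and time_stamp follow the thread; the function name latest_jumped, the 1.5 factor as a parameter, and the sample data are illustrative.

```python
import sqlite3

def latest_jumped(conn, factor=1.5):
    # Fetch the two most recent rows, newest first.
    rows = conn.execute(
        "SELECT datavalues FROM myTable ORDER BY time_stamp DESC LIMIT 2"
    ).fetchall()
    if len(rows) < 2:
        return False  # nothing to compare against yet
    latest, previous = rows[0][0], rows[1][0]
    # True when the newest value is at least 50% above the one before it.
    return latest >= previous * factor

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (datavalues REAL, time_stamp TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(100.0, "2004-06-02 10:00"), (104.0, "2004-06-02 10:05"),
     (160.0, "2004-06-02 10:10")],
)
jumped = latest_jumped(conn)          # 160 vs 104
conn.execute("INSERT INTO myTable VALUES (170.0, '2004-06-02 10:15')")
steady = latest_jumped(conn)          # 170 vs 160
```

Unlike a MAX/MIN ratio, this keeps the direction of the change: a drop from 160 to 80 would not trigger it.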

I very much appreciate the instruction and help. Everything you wrote
was very informative and I thank you!

B.|||>> I have to disagree here Joe. For example, one of our tables
records the time a banner ad is served. There's no "begin and end"
there's an instant. <<

Your banner ads have no duration? Wow! That means that if I want to
run 1000 ads, and a database that records times to 1/1000 of a second,
then I can put 1000 ads in each second :)

Time intervals in SQL are shown as either (start, finish) or (start,
duration) -- look up the stuff for the OVERLAPS predicate in Standard
SQL.|||>> So why not? I bet the underlying business problem looks a whole
lot like that! <<

Copying a paper form directly into a RDBMS is not a good idea. This
problem sounds like a history; history is a series of durations.

>> You know, most people who are using databases are trying to solve
real-world problems, not do exercises in relational algebra. <<

That is like saying the people at Enron ignored math because they "are
trying to solve real-world problems, not do exercises in relational
algebra" :)|||--CELKO-- wrote:
> That is like saying the people at Enron ignored math because they "are
> trying to solve real-world problems, not do exercises in relational
> algebra" :)

Actually, they were trying to *avoid* real-world problems. ;)

Zach|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:18c7b3c2.0406030904.c61d8f3@.posting.google.co m...
> >> I have to disagree here Joe. For example, one of our tables
> records the time a banner ad is served. There's no "begin and end"
> there's an instant. <<
> Your banner ads have no duration? Wow! That means that if I want to
> run 1000 ads, and a database that records times to 1/1000 of a second,
> then I can put 1000 ads in each second :)

Serving them has no duration, no.

And we serve (and record the serving) of over 14 million banner ads a day.
I'll let you do the math.

> Time intervals in SQL are shown as either (start, finish) or (start,
> duration) -- look up the stuff for the OVERLAPS predicate in Standard
> SQL.

See, I'm not talking about time intervals, you are.|||>> Serving them has no duration, no. And we serve (and record the
serving) of over 14 million banner ads a day. <<

Perhaps I do not understand what "serving" means; can you give me a
scenario? I am a customer; I want to run a banner ad. My banner
needs to be up from Christmas to New Year's Day. It needs to run
between 0600 UTC and 1200 UTC every day, and it is 5 seconds long. What
are you recording so that I get the exposure for which I am paying?|||On 4 Jun 2004 12:25:52 -0700, --CELKO-- wrote:

>>> Serving them has no duration, no. And we serve (and record the
> serving) of over 14 million banner ads a day. <<
> Perhaps I do not understand what "serving" means; can you give me a
> scenario? I am a customer; I want to run a banner ad. My banner
> needs to be up from Christmas to New Year's Day. It needs to run
> between 0600 UTC and 1200 UTC every day, and it is 5 seconds long. What
> are you recording so that I get the exposure for which I am paying?

I should let Greg explain it himself, but I can't seem to hold myself back
...

The duration you describe would be just what is needed to select at any
given instant which ad(s) should be active. However, once the ad has been
selected and HTML has been emitted to have it appear on the page, the
instant that it was emitted will be recorded in the database. A time
instant, not a duration.|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:18c7b3c2.0406041125.3cd714a7@.posting.google.c om...
> >> Serving them has no duration, no. And we serve (and record the
> serving) of over 14 million banner ads a day. <<
> Perhaps I do not understand what "serving" means; can you give me a
> scenario? I am a customer; I want to run a banner ad. My banner
> needs to be up from Christmas to New Year's Day. It needs to run
> between 0600 UTC and 1200 UTC every day, and it is 5 seconds long. What
> are you recording so that I get the exposure for which I am paying?

Banners don't have a duration. (Well, active content does, but let's come
back to that.)

So, given the criteria you've given, we'd charge you probably (and this
varies but for this we simplify) on number of impressions, i.e., the number
of times it's been served.

So an end user goes to a page on our site. The code (generally ASP) goes to
a DB table and queries "what banner should I show." (there may be more than
one scheduled for say the top position on the page). A "coin flip" is made
and the DB hands back a URL to the banner in question. At that point it
records in the database that your banner was served.
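
The serving step described above can be sketched roughly as follows, using SQLite from Python. The thread does not show the real schema, so the tables banners and impressions, their columns, and the function serve_banner are all hypothetical. The point is only that the impression row stores a single instant (served_at) and nothing about how long the banner stayed on screen.

```python
import random
import sqlite3

def serve_banner(conn, position, served_at, rng=random):
    # Find every banner scheduled for this position on the page.
    candidates = conn.execute(
        "SELECT banner_id, url FROM banners WHERE position = ?", (position,)
    ).fetchall()
    if not candidates:
        return None
    # The "coin flip": pick one of the scheduled banners at random.
    banner_id, url = rng.choice(candidates)
    # Record the impression as a single instant; no end time is stored.
    with conn:
        conn.execute(
            "INSERT INTO impressions (banner_id, served_at) VALUES (?, ?)",
            (banner_id, served_at),
        )
    return url

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE banners (banner_id INTEGER PRIMARY KEY, position TEXT, url TEXT)"
)
conn.execute("CREATE TABLE impressions (banner_id INTEGER, served_at TEXT)")
conn.executemany(
    "INSERT INTO banners VALUES (?, ?, ?)",
    [(1, "top", "/ads/a.gif"), (2, "top", "/ads/b.gif")],
)
url = serve_banner(conn, "top", "2004-06-05 09:30:00.123")
served = conn.execute("SELECT COUNT(*) FROM impressions").fetchone()[0]
```

Billing by impressions then reduces to counting rows in impressions per banner.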

Now, the end user may spend 10 seconds at that page or 10 minutes. At that
point we don't really care (and have limited ways of knowing anyway since
the web is generally stateless). You're not paying for that, you're paying
for impressions served.

Now, a standard banner is generally just a gif file, or perhaps an animated
gif. Active content may be fancier, like a flash ad (yuck) or Quicktime,
etc. But again, we don't record any of that, simply the time that the
banner was handed to the user.

This is fairly standard in terms of how banner ads are served. So, no
duration, simply an impression.|||>> Banners don't have a duration. (Well, active content do, but let's
come
back to that.) <<

Isn't that content and its duration what the buyer is paying for?

>> we'd charge you probably on number of impressions. I.e the number
of times it's been served. <<

I understand billing by the number of hits. But if you can put 1000
impressions of my ad in a banner in one minute, I am not as happy as I
would be having them spaced out and retained at least long enough for a
human being to read. When I record the hits, I record them in a time
slot -- you can only hit the banner when it is displayed.

>> So an end user goes to a page on our site. The code (generally
ASP) goes to
a DB table and queries "what banner should I show." A "coin flip" is
made
and the DB hands back a URL to the banner in question. At that point
it
records in the database that your banner was served. Now, the end
user may spend 10 seconds at that page or 10 minutes. <<

At that point, I don't care; the user has left the banner and is in
the target URL. I then have to model his URL behavior in a new set of
tables.

But what has happened back at the banner, which is what I was modeling? I
hope the banner was there for a duration greater than zero time units.
When my time slot of duration (t1) was used up, can I assume another
banner got a time slot of (t2) in that banner? You do not "machine
gun" banners so fast that they are not readable.

What you are saying is that "the half of the fact" you see is like a
shipping clerk -- packages only leave in his world view. Likewise a
receiving clerk only sees packages arriving in his world view. But
the whole fact is that package makes a trip that takes time until it
arrives someplace (or is declared lost in transit). The correct model
is global, not local.|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:18c7b3c2.0406051347.6cd6a377@.posting.google.c om...
> >> Banners don't have a duration. (Well, active content do, but let's
> come
> back to that.) <<
> Isn't that content and its duration what the buyer is paying for?

No, since duration can't be measured via the web since it's stateless.

> >> we'd charge you probably on number of impressions. I.e the number
> of times it's been served. <<
> I understand billing by the number of hits. But if you can put 1000
> impressions of my ad in a banner in one minute, I am not as happy as I
> would be having them spaced out and retained at least long enough for a
> human being to read. When I record the hits, I record them in a time
> slot -- you can only hit the banner when it is displayed.

Because that's all we can know.

> But what has happened back at the banner, which is what I was modeling? I
> hope the banner was there for a duration greater than zero time units.
> When my time slot of duration (t1) was used up, can I assume another
> banner got a time slot of (t2) in that banner? You do not "machine
> gun" banners so fast that they are not readable.

Banners are delivered as often as the user clicks the page. If they go to a
page, read the page, walk away and come back 3 weeks later, the same banner
is still displayed. If they go to a page, quickly see it's not what they
want and click a link to another page, then they'll get a different banner.

You may HOPE the banner was there for duration greater than zero time units,
but generally the web doesn't work that way.

> What you are saying is that "the half of the fact" you see is like a
> shipping clerk -- packages only leave in his world view. Likewise a
> receiving clerk only sees packages arriving in his world view. But
> the whole fact is that package makes a trip that takes time until it
> arrives someplace (or is declared lost in transit). The correct model
> is global, not local.

No it's not. That may be what you want it to be, but that's not the way
banner traffic works.|||Greg D. Moore (Strider) (mooregr_deleteth1s@.greenms.com) writes:
> So an end user goes to a page on our site. The code (generally ASP)
> goes to a DB table and queries "what banner should I show." (there may
> be more than one scheduled for say the top position on the page). A
> "coin flip" is made and the DB hands back a URL to the banner in
> question. At that point it records in the database that your banner was
> served.

But, of course, at my end, I am running a banner-filter proxy, so I never
see the bastard anyway. :-)

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

|||>> Banners are delivered as often as the user clicks the page. If they go
to a page, read the page, walk away and come back 3 weeks later, the
same banner is still displayed. If they go to a page, quickly see
it's not what they want and click a link to another page, then they'll
get a different banner. <<

I would make a big distinction between the banner (the thing that leads
to my Christmas sale) and the image of the banner that has persisted
for weeks after the holidays in the local storage of a particular
machine. My contract was for a duration (Christmas season) and was
with the website that offered to run my banner. They were to display
from one date to another. If it got clicked (n) times between
December 01 and December 25, then I owe them according to some
formula. Maybe if the click leads to sale between December 01 and
December 25, then I owe them according to another formula. But
outside that (possibly open ended) duration, there is no obligation.

Years ago in Los Angeles, I worked on a data model for a cable TV
shopping network for a major department store chain. This was even
worse because each time a commercial played, we had to compute the
actor's residuals, the assorted union pay rates, and how to credit the
purchase to the nearest local store, and the right department within
that store. Arrgh!|||"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:18c7b3c2.0406061538.50f3cfbf@.posting.google.c om...
> >> Banners are delivered as often as the user clicks the page. If they go
> to a page, read the page, walk away and come back 3 weeks later, the
> same banner is still displayed. If they go to a page, quickly see
> it's not what they want and click a link to another page, then they'll
> get a different banner. <<
> I would make a big distinction between the banner (the thing that leads
> to my Christmas sale) and the image of the banner that has persisted
> for weeks after the holidays in the local storage of a particular
> machine.

You might, but that's irrelevant. I'm discussing what the viewer sees on
the screen.

> My contract was for a duration (Christmas season) and was
> with the website that offered to run my banner. They were to display
> from one date to another. If it got clicked (n) times between
> December 01 and December 25, then I owe them according to some
> formula. Maybe if the click leads to sale between December 01 and
> December 25, then I owe them according to another formula. But
> outside that (possibly open ended) duration, there is no obligation.

You're discussing the contract, I'm discussing the serving, those are two
distinct items.

Ask them how long each individual banner was displayed on an end-user's
system (the "duration" as you're referring to it.) They can't do it. (and
if they claim they can, they are most likely fudging some of the data).

> Years ago in Los Angeles, I worked on a data model for a cable TV
> shopping network for a major department store chain. This was even
> worse because each time a commercial played, we had to compute the
> actor's residuals, the assorted union pay rates, and how to credit the
> purchase to the nearest local store, and the right department within
> that store. Arrgh!

Years ago I was involved as a sub-contractor (big mistake) on a system for a
major network who shall remain nameless. System was to handle the
scheduling of commercials on the network and had to deal with all sorts of
items such as regional variations within a broadcast (might show snow-tire
commercials up north while showing a vacation commercial down south.) At
that time I wasn't too involved with the schema, but more so the hardware.
It was an interesting situation trying to deal with handling things like
failures and having to predict the required hardware for a reload (I forget
the number of transactions that would have to occur in 10 minutes but at the
time, it was a fairly impressive number). Of course the contractor I was
working for kept saying, "Oh, don't worry, the hardware can handle it." I
was the one that kept saying, "no, we have to model this." (generally I've
got enough faith in "gut instinct" I can model some systems in my head, this
was definitely not one of them.)

Anyway, I just thought of a good distinction here.

Whereas things like commercials have a fixed run time, etc. serving banners
is more like blow-ins in the newspaper. You contract for a specific number
over a specific time, but generally don't care about the exact time the
blow-in is blown-in (and can't model when the reader actually reads it or
tosses it in the trash.)

Yes, the contract has a duration which is critical to model (i.e. when to
start serving, stop serving) but the individual serving is only an
"instant."|||"Greg D. Moore (Strider)" <mooregr_deleteth1s@.greenms.com> wrote in message
news:dS8wc.52769$j24.31693@.twister.nyroc.rr.com...
> "--CELKO--" <jcelko212@.earthlink.net> wrote in message
> news:18c7b3c2.0406041125.3cd714a7@.posting.google.c om...
> > >> Serving them has no duration, no. And we serve (and record the
> > serving) of over 14 million banner ads a day. <<

<snip>
> Banners don't have a duration. (Well, active content do, but let's come
> back to that.).
> So, given the criteria you've given, we'd charge you probably (and this
> varies but for this we simplify) on number of impressions. I.e the number
> of times it's been served.
> So an end user goes to a page on our site. The code (generally ASP) goes to
> a DB table and queries "what banner should I show." (there may be more than
> one scheduled for say the top position on the page). A "coin flip" is made
> and the DB hands back a URL to the banner in question. At that point it
> records in the database that your banner was served.

Sounds like a Traffic system to me... ;)
We do the same thing for Television advertising. We should talk...

--
Paul Horan
Sr. Architect
Video Communications, Inc.
www.vcisolutions.com|||Joe,
I think what you're missing in this discussion is the distinction between the ORDER for an impression and the audit
trail of the DELIVERY of that impression.

Both Greg and I work in the Advertising "Traffic" business. Our respective software products are basically Order
Fulfillment processes - my company writes Traffic software for television stations, and Greg's company works in the
internet medium.

The attributes of an Order most definitely DO contain "duration" - both a start_date/end_date duration (so that you
don't run Christmas special ads in February - we call those discrete periods of time a "flight"), as well as two columns
for storing the start/end time within which the spot should "air". Some advertisers only want their spots to be seen in
"prime time", for example...

Our TV clients are also concerned with the duration of the spot itself. Is this a regular :30 second spot? a :15? a
28:30 infomercial? Greg's clients don't have that concept - but I'm sure they do have X/Y and width/height parameters.
Once the HTML is rendered and delivered to the browser, that is officially recorded as one "impression" that happened at
a specific nanosecond in time.
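
The order attributes described above (a flight of dates plus a daily airing window) can be sketched as a small eligibility check. The class Order, its field names, and the function may_air are hypothetical and are not taken from either product. The sketch also assumes the daily window does not cross midnight.

```python
from dataclasses import dataclass
from datetime import date, datetime, time

@dataclass
class Order:
    flight_start: date   # first day the spot may run
    flight_end: date     # last day the spot may run
    window_start: time   # earliest time of day it may air
    window_end: time     # latest time of day it may air (same day)

def may_air(order, moment):
    # The order is a duration; the moment being tested is a single instant.
    in_flight = order.flight_start <= moment.date() <= order.flight_end
    in_window = order.window_start <= moment.time() <= order.window_end
    return in_flight and in_window

christmas = Order(date(2004, 12, 1), date(2004, 12, 25), time(20, 0), time(23, 0))
prime_time_in_december = may_air(christmas, datetime(2004, 12, 10, 21, 30))
prime_time_in_february = may_air(christmas, datetime(2005, 2, 10, 21, 30))
afternoon_in_december = may_air(christmas, datetime(2004, 12, 10, 14, 0))
```

Only the first of the three moments falls inside both the flight and the window. The delivery itself would then be logged as the single instant that passed this check.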

We collect the order "parameters" as attributes of the Order entity itself. Then comes the process of matching up
Orders and available "Inventory", and delivering the content. We print out a daily program log so Master Control knows
which commercials to air in what order, and Greg serves up links to banner ads in "real-time". All that's left is the
recording of that delivery, so that the advertisers (or ad agencies) can pay their bills.

--
Paul Horan[TeamSybase]

"--CELKO--" <jcelko212@.earthlink.net> wrote in message news:18c7b3c2.0406061538.50f3cfbf@.posting.google.c om...
> >> Banners are delivered as often as the user clicks the page. If they go
> to a page, read the page, walk away and come back 3 weeks later, the
> same banner is still displayed. If they go to a page, quickly see
> it's not what they want and click a link to another page, then they'll
> get a different banner. <<
> I would make a big distinction between the banner (the thing that leads
> to my Christmas sale) and the image of the banner that has persisted
> for weeks after the holidays in the local storage of a particular
> machine. My contract was for a duration (Christmas season) and was
> with the website that offered to run my banner. They were to display
> from one date to another. If it got clicked (n) times between
> December 01 and December 25, then I owe them according to some
> formula. Maybe if the click leads to sale between December 01 and
> December 25, then I owe them according to another formula. But
> outside that (possibly open ended) duration, there is no obligation.
> Years ago in Los Angeles, I worked on a data model for a cable TV
> shopping network for a major department store chain. This was even
> worse because each time a commercial played, we had to compute the
> actor's residuals, the assorted union pay rates, and how to credit the
> purchase to the nearest local store, and the right department within
> that store. Arrgh!

Thursday, March 22, 2012
