One of our customers has an SQL 2000 (8.00.194) system being updated by 20
remote servers and 5 workstations. As long as the 20 remote servers are
running there does not seem to be a problem. When the 5 workstations start
accessing the data we get the following errors.
I am still awating event log data...
<errors>
Timeout expired. The timeout period elapsed prior to completion of the
operation or the server is not responding.
Transaction (Process ID74) was deadlocked on {lock} resources with another
process and has been chosen as the deadlock victim. Rerun the transaction.
</errors>
I understand in concept why I am getting the errors but the problem is the
SQL server needs to be restarted to fix the errors. Basically neither the
workstations or the remote servers are able to pull data form the SQL server
anymore. Both workstations and sever software is C# under dot NET.
Other things to consider. The server update 1 record from the primary
table and inserts an unknown number of records into 4 other tables. The
transaction must be fully completed or fail. If it fails the data is
written to disk to be attempted at a future time.
The workstation is receiving the error when it reads from the primary table
using a filter on the primary key. There is no transaction since I assume
on error it can be re-read. ( Would it help to put this in a transaction, I
was assuming less load not too ) During design I expected the error (not
deadlock but timeout) but also assumed that the user could just re-read the
data, any clue why they SQL server has to be restarted? Note the server
are also looking for the record to be updated using the primary key. Nether
the remote servers, workstations or the software run on them needs
restarting.
In concept no two servers will ever try to update the same primary record
but it is remotely possible due to the data caching on failure. The
workstation does update the primary record but mostly it is a reading when
the error occurs.
Suggestions?
Comments?
Regards,
John
Hi John
"John J. Hughes II" wrote:
> One of our customers has an SQL 2000 (8.00.194) system being updated by 20
> remote servers and 5 workstations.
This is a very old version of SQL Server 2000, you may want to considering
bringing it up to date.
> As long as the 20 remote servers are
> running there does not seem to be a problem. When the 5 workstations start
> accessing the data we get the following errors.
> I am still awating event log data...
> <errors>
> Timeout expired. The timeout period elapsed prior to completion of the
> operation or the server is not responding.
> Transaction (Process ID74) was deadlocked on {lock} resources with another
> process and has been chosen as the deadlock victim. Rerun the transaction.
> </errors>
> I understand in concept why I am getting the errors but the problem is the
> SQL server needs to be restarted to fix the errors. Basically neither the
> workstations or the remote servers are able to pull data form the SQL server
> anymore. Both workstations and sever software is C# under dot NET.
> Other things to consider. The server update 1 record from the primary
> table and inserts an unknown number of records into 4 other tables. The
> transaction must be fully completed or fail. If it fails the data is
> written to disk to be attempted at a future time.
Do you have any maintenance tasks running on the server to update the indexes?
Have you checked the disc fragmentation and space available?
> The workstation is receiving the error when it reads from the primary table
> using a filter on the primary key. There is no transaction since I assume
> on error it can be re-read. ( Would it help to put this in a transaction, I
> was assuming less load not too ) During design I expected the error (not
> deadlock but timeout) but also assumed that the user could just re-read the
> data, any clue why they SQL server has to be restarted? Note the server
> are also looking for the record to be updated using the primary key. Nether
> the remote servers, workstations or the software run on them needs
> restarting.
Running sp_who2 to see what is blocking or for more detailed information
look at http://support.microsoft.com/default.aspx/kb/271509. Blocking can
occur on different things and at different levels. You don't say if there are
any processes other than yours that access the system.
> In concept no two servers will ever try to update the same primary record
> but it is remotely possible due to the data caching on failure. The
> workstation does update the primary record but mostly it is a reading when
> the error occurs.
You may want to run SQL Profiler to see the activity on the system.
> Suggestions?
> Comments?
> Regards,
> John
>
John
|||Hello John,
In addition to the below recomendations you can also trace the deadlock in
SQL Server error log by enabling the trace flag 1204 & 3205.
Both the trace will help you to finetune your deadlock problem and put you
more closer to the error.
But before enabling trace flag I highly recomend to install SP4 for SQL 2000
and run DBCC command to check database consistancy & tune indexes.
NOTE: BE CAREFUL WHILE RUNNING "DBCC" ON PRODUCTION IT MIGHT HAVE BLOCKING
PROBLEM. RUN DBCC DURING OFFPICK HRs.
Good Luck!
MB
"John Bell" <jbellnewsposts@.hotmail.com> wrote in message
news:F81CEDB8-2297-44FE-A06F-B6D87AE4C1C1@.microsoft.com...
> Hi John
> "John J. Hughes II" wrote:
> This is a very old version of SQL Server 2000, you may want to considering
> bringing it up to date.
>
> Do you have any maintenance tasks running on the server to update the
> indexes?
> Have you checked the disc fragmentation and space available?
>
> Running sp_who2 to see what is blocking or for more detailed information
> look at http://support.microsoft.com/default.aspx/kb/271509. Blocking can
> occur on different things and at different levels. You don't say if there
> are
> any processes other than yours that access the system.
>
> You may want to run SQL Profiler to see the activity on the system.
> John
|||Thank you both for your suggestions, I will see what I can get them to do.
I have been trying to get them to update to any service pack for years now,
will try again.
Regards,
John
"MB" <MB@.MB.com> wrote in message
news:u4wb$3zLHHA.2456@.TK2MSFTNGP06.phx.gbl...
> Hello John,
> In addition to the below recomendations you can also trace the deadlock in
> SQL Server error log by enabling the trace flag 1204 & 3205.
> Both the trace will help you to finetune your deadlock problem and put you
> more closer to the error.
> But before enabling trace flag I highly recomend to install SP4 for SQL
> 2000 and run DBCC command to check database consistancy & tune indexes.
> NOTE: BE CAREFUL WHILE RUNNING "DBCC" ON PRODUCTION IT MIGHT HAVE BLOCKING
> PROBLEM. RUN DBCC DURING OFFPICK HRs.
> Good Luck!
> MB
>
> "John Bell" <jbellnewsposts@.hotmail.com> wrote in message
> news:F81CEDB8-2297-44FE-A06F-B6D87AE4C1C1@.microsoft.com...
>
|||Ok I have done a couple things. First I have gotten them to install SP4
and check the database with DBCC and rebuild the index files but the problem
is still occuring.
They locked again tonight and I was able to do a little testing before they
restarted the SQL server. I noted that where they were not able to pull up
the data they wanted they were able to pull data from other tables so the
SQL server was not as stuck as they said.
In viewing the sp_who2 data it look pretty much the same as when the system
was running normally, I did not see anything abnormal. Clue what I should
look for?
Also where they could not load the data in my application which uses a
simple select and join in stored procedure I was able to pull the same data
using a query.
I am thinking there is a lock someplace that is not being released but when
I view the active locks on either the normal running system or the system
when it has a problem there is about a 100+ of them so is there a way to
determine if a lock is not being released? In both the activity monitor
and the profiler there is just too much data to see a stuck lock.
Also when connecting with the profiler the system when the problem occurs is
locking and releasing locks but it is also showing a large number of
timeouts. There are also a large number of dead locks. If they don't use
the stored procedure to pull the data in my application at the workstation
the other server were able to run for 3 days without events.
Suggestions?
Stored procedure that seems to be causing the problem: 16K records in
Directories and maybe 5 records in properties.
ALTER PROCEDURE [dbo].[SelectPayphone]
(
@.Directory nvarchar(100) = NULL,
@.ServerName nvarchar(100) = NULL,
@.AreaCode varchar(20) = NULL,
@.Number varchar(20) = NULL
)
AS
SET NOCOUNT ON;
SELECT (CAST(D.CoinBoxEquValue AS DECIMAL) / CAST(P.CoinBoxFull AS
DECIMAL)) * 100 AS PercentFull, P.CoinBoxFull, D.*
FROM Directories AS D JOIN Properties AS P ON D.Properties = P.Name
WHERE (
((@.Directory IS NOT NULL) AND ([Directory] = @.Directory)) OR ((@.Directory
IS NULL) AND ISNULL([Directory], '') >= ''))
AND
(((@.ServerName IS NOT NULL) AND ([ServerName] = @.ServerName)) OR
((@.ServerName IS NULL) AND (ISNULL([ServerName], '') >= ''))
AND
((@.AreaCode IS NOT NULL) AND (AreaCode LIKE @.AreaCode) OR (@.AreaCode IS
NULL))
AND
((@.Number IS NOT NULL) AND (Number LIKE @.Number) OR (@.Number IS NULL))
)
ORDER BY AreaCode, Number;
Regards,
John
"John Bell" <jbellnewsposts@.hotmail.com> wrote in message
news:F81CEDB8-2297-44FE-A06F-B6D87AE4C1C1@.microsoft.com...
> Hi John
> "John J. Hughes II" wrote:
> This is a very old version of SQL Server 2000, you may want to considering
> bringing it up to date.
>
> Do you have any maintenance tasks running on the server to update the
> indexes?
> Have you checked the disc fragmentation and space available?
>
> Running sp_who2 to see what is blocking or for more detailed information
> look at http://support.microsoft.com/default.aspx/kb/271509. Blocking can
> occur on different things and at different levels. You don't say if there
> are
> any processes other than yours that access the system.
>
> You may want to run SQL Profiler to see the activity on the system.
> John
|||Hi John
"John J. Hughes II" wrote:
> They locked again tonight and I was able to do a little testing before they
> restarted the SQL server. I noted that where they were not able to pull up
> the data they wanted they were able to pull data from other tables so the
> SQL server was not as stuck as they said.
This does sound like something specific is being blocked
> In viewing the sp_who2 data it look pretty much the same as when the system
> was running normally, I did not see anything abnormal. Clue what I should
> look for?
The BLK column should be blank, if it contains a value then that is the
process which is blocking. The procedure sp_blocker_pss80 as defined in the
link I previously supplied provides more detailed blocking information see
http://support.microsoft.com/default.aspx/kb/271509
> Also when connecting with the profiler the system when the problem occurs is
> locking and releasing locks but it is also showing a large number of
> timeouts.
Lock timeout may be an indication of a missing index or an index that is not
being used.
> There are also a large number of dead locks. If they don't use
> the stored procedure to pull the data in my application at the workstation
> the other server were able to run for 3 days without events.
>
Other server?
> Suggestions?
> Stored procedure that seems to be causing the problem: 16K records in
> Directories and maybe 5 records in properties.
> ALTER PROCEDURE [dbo].[SelectPayphone]
> (
> @.Directory nvarchar(100) = NULL,
> @.ServerName nvarchar(100) = NULL,
> @.AreaCode varchar(20) = NULL,
> @.Number varchar(20) = NULL
> )
> AS
> SET NOCOUNT ON;
> SELECT (CAST(D.CoinBoxEquValue AS DECIMAL) / CAST(P.CoinBoxFull AS
> DECIMAL)) * 100 AS PercentFull, P.CoinBoxFull, D.*
> FROM Directories AS D JOIN Properties AS P ON D.Properties = P.Name
> WHERE (
> ((@.Directory IS NOT NULL) AND ([Directory] = @.Directory)) OR ((@.Directory
> IS NULL) AND ISNULL([Directory], '') >= ''))
> AND
> (((@.ServerName IS NOT NULL) AND ([ServerName] = @.ServerName)) OR
> ((@.ServerName IS NULL) AND (ISNULL([ServerName], '') >= ''))
> AND
> ((@.AreaCode IS NOT NULL) AND (AreaCode LIKE @.AreaCode) OR (@.AreaCode IS
> NULL))
> AND
> ((@.Number IS NOT NULL) AND (Number LIKE @.Number) OR (@.Number IS NULL))
> )
> ORDER BY AreaCode, Number;
> Regards,
> John
I would suspect the bottleneck to be the directories table.
Production code should never contain SELECT *, only specify the columns
that you require. In query analyser you can drag the column names into the
query window. If you drag the columns branch itself then you will get all the
column names. You have not specified table aliases in the where clause. It is
good practice to so to avoid conflict if the query changes. Decimal datatypes
default to a precision of 38 and scale of 0, if you are submitting decimal
values then you may get spurious results e.g.
SELECT CAST(10 as decimal)/CAST(2.5 as decimal),
CAST(10 as decimal)/2.5,
10/CAST(2.5 as decimal),
CAST(10 as decimal(38,0))/2.5
Your where clause can probably be simplified to:
WHERE (D.[Directory] = @.Directory OR @.Directory IS NULL)
AND (D.[ServerName] = @.ServerName OR @.ServerName IS NULL)
AND (D.[AreaCode] = @.AreaCode OR @.AreaCode IS NULL)
AND (D.[Number] = @.Number OR @.Number IS NULL)
ORDER BY D.AreaCode, D.Number;
This is assuming that columns used are all in the Directories table. If you
look at the query plan you may see that it does not use the selective indexes
and does not perform very well (especially on large volumes of data). You
should read http://www.sommarskog.se/dyn-search.html and test some of the
alternative solutions.
John
|||Thanks again for the education...
"John Bell" <jbellnewsposts@.hotmail.com> wrote in message
news:89CE302F-9363-4F0B-85A9-E6C18FE97BCB@.microsoft.com...
> Hi John
> "John J. Hughes II" wrote:
>
> This does sound like something specific is being blocked
> The BLK column should be blank, if it contains a value then that is the
> process which is blocking. The procedure sp_blocker_pss80 as defined in
> the
> link I previously supplied provides more detailed blocking information see
> http://support.microsoft.com/default.aspx/kb/271509
Ok I have run this on both systems and I don't see a "BLK" column, there is
a "Blocking" column which on my local test system is always 0 being as
nothing is happening there. On the SQL 2000 system the "Blocking" column =
a number and constantly changes.
> Lock timeout may be an indication of a missing index or an index that is
> not
> being used.
Per the tuning engine with no changes the SP is using PhoneNumbers (primary
directory key) and the property index (link between directories and
properties). I tried adding to my local system a new index which uses the
4 columns I am searching (servername, directory, areacode, number) but the
tuning engine still continues to use the primary key. I would think my new
index would be a better match and allow the system to use a different index
for this search thus removing the load from the first key if locked?
This is on my SQL 2005 system which is no doing much, not sure if it would
change on the SQL 2000 system under load.
Based on the below reading it seems I might have to change to dynamic SQL...
hum, will test.
> Other server?
There is one DB server (SQL 2000) and 20 remote server (no local database)
connected to the DB server. The 20 remote servers are using the DB server
database. Each remote server queries and updates data constantly.
>
> I would suspect the bottleneck to be the directories table.
> Production code should never contain SELECT *, only specify the columns
> that you require. In query analyser you can drag the column names into the
> query window. If you drag the columns branch itself then you will get all
> the
> column names. You have not specified table aliases in the where clause. It
> is
> good practice to so to avoid conflict if the query changes. Decimal
> datatypes
> default to a precision of 38 and scale of 0, if you are submitting decimal
> values then you may get spurious results e.g.
> SELECT CAST(10 as decimal)/CAST(2.5 as decimal),
> CAST(10 as decimal)/2.5,
> 10/CAST(2.5 as decimal),
> CAST(10 as decimal(38,0))/2.5
> Your where clause can probably be simplified to:
> WHERE (D.[Directory] = @.Directory OR @.Directory IS NULL)
> AND (D.[ServerName] = @.ServerName OR @.ServerName IS NULL)
> AND (D.[AreaCode] = @.AreaCode OR @.AreaCode IS NULL)
> AND (D.[Number] = @.Number OR @.Number IS NULL)
> ORDER BY D.AreaCode, D.Number;
> This is assuming that columns used are all in the Directories table. If
> you
> look at the query plan you may see that it does not use the selective
> indexes
> and does not perform very well (especially on large volumes of data). You
> should read http://www.sommarskog.se/dyn-search.html and test some of the
> alternative solutions.
Thanks for the suggestion.
> John
>
|||Hi John
> Ok I have run this on both systems and I don't see a "BLK" column, there is
> a "Blocking" column which on my local test system is always 0 being as
> nothing is happening there. On the SQL 2000 system the "Blocking" column =
> a number and constantly changes.
>
The number is the SID of the process which is blocking, you can then look
that up to see if that SID is also blocked until you get the the root SID
that is causing the problem. sp_blocker_pss80 does this for you.
>
> Per the tuning engine with no changes the SP is using PhoneNumbers (primary
> directory key) and the property index (link between directories and
> properties). I tried adding to my local system a new index which uses the
> 4 columns I am searching (servername, directory, areacode, number) but the
> tuning engine still continues to use the primary key. I would think my new
> index would be a better match and allow the system to use a different index
> for this search thus removing the load from the first key if locked?
> This is on my SQL 2005 system which is no doing much, not sure if it would
> change on the SQL 2000 system under load.
I am not sure that the tuning engine is in this context, in SSMS choose the
Include Actual Execution Plan option on the Query menu (Ctrl-M) to see the
actual query plan, in Query Analyser there is a similar option.
I assume this is a clustered index and it is doing a scan, which basically
means it is sequentually processing each record in the table.
> Based on the below reading it seems I might have to change to dynamic SQL...
> hum, will test.
>
> There is one DB server (SQL 2000) and 20 remote server (no local database)
> connected to the DB server. The 20 remote servers are using the DB server
> database. Each remote server queries and updates data constantly.
>
Both the database server and any system connecting to is should run without
the need for a reboot for as long as required!
John
|||Thanks again...
"John Bell" <jbellnewsposts@.hotmail.com> wrote in message
news:AACFE723-91B3-4B2F-9665-7B6810770325@.microsoft.com...
> Hi John
>
> The number is the SID of the process which is blocking, you can then look
> that up to see if that SID is also blocked until you get the the root SID
> that is causing the problem. sp_blocker_pss80 does this for you.
> I am not sure that the tuning engine is in this context, in SSMS choose
> the
> Include Actual Execution Plan option on the Query menu (Ctrl-M) to see the
> actual query plan, in Query Analyser there is a similar option.
> I assume this is a clustered index and it is doing a scan, which basically
> means it is sequentually processing each record in the table.
The PhoneNumber key is clustered and the primary key set to unique on
areacode and number.
The tuning engine seems useless for dynamic SQL. The SSMS with AEP does
show what is happening but does not seem to tell me which index is being
used, only that a clustered index seek is being done for properties and
directories. Depending on how many values I specify the percentage changes
between the two where all values give me 50%/50% and no values give me
58%/41% which I am not sure is helpful. I created a test db on my SQL 2005
server with 50K rows and the result are either very slow because of transfer
time (the first result appears before the time hits 00:00:01) or there is no
measurable time.
By the way is there a setting in SQL 2000 where I can tell the server is a
lock is held over say 2 minutes to kill it?
> Both the database server and any system connecting to is should run
> without
> the need for a reboot for as long as required!
Yes which they were doing and are not now, other system are not having this
problem... Have to fix sooner then later.
Regards,
John
|||Hi John
> The PhoneNumber key is clustered and the primary key set to unique on
> areacode and number.
> The tuning engine seems useless for dynamic SQL.
If you have followed Erlands article, you will be probably be sending
different statements to the database depending on what values you pass to the
procedure. The query execution plan should change for different values being
passed.
> The SSMS with AEP does
> show what is happening but does not seem to tell me which index is being
> used, only that a clustered index seek is being done for properties and
> directories. Depending on how many values I specify the percentage changes
> between the two where all values give me 50%/50% and no values give me
> 58%/41% which I am not sure is helpful. I created a test db on my SQL 2005
> server with 50K rows and the result are either very slow because of transfer
> time (the first result appears before the time hits 00:00:01) or there is no
> measurable time.
You have to remember that the live system is SQL 2000 and the query engine
for SQL 2005 is significantly different. As your system was suffering
blocking, then the aim for re-writing the procedure is to make sure that it
is using indexes and being more efficient, you will only see if this has
really worked by testing in an environment similar to live.
It is not clear from what has been posted what is the actual cause of the
blocking. Did you run sp_blocker_pss80 and has it told you what the blocking
process is doing? You should also look at how you are updating the tables and
making sure that you don't have transactions that are too long. Are there any
triggers on these tables?
> By the way is there a setting in SQL 2000 where I can tell the server is a
> lock is held over say 2 minutes to kill it?
You can give a maximum cost for a query and not execute it, but that will
not be any good in a blocking situation or for your application. You may want
to post DDL and example data in which case check out
http://www.aspfaq.com/etiquette.asp?id=5006
HTH
John
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment