The Whole Package Cache – Podcast!

March 25, 2014

Listen to DB2 experts Ian Bjorhovde and Fred Sobotka chat about what’s new in the world of DB2.

pkgcache.com

Its entertaining and informative!

RAHTREETHRESH and Permission denied errors

February 5, 2014

On our DPF environment(DB2 10.1 on Linux) with 195 partitions, non-instance owner userid’s would receive permission denied errors for RAH & db2_all commands.

Permission denied (gssapi-keyex,gssapi-with-mic,publickey,keyboard-interactive).

DB2RSHCMD has been set to SSH, and I did have SSH configured to access without password. An SSH into the host would succeed without a password prompt or any errors.

Upon looking at the IBM documentation, i came across the environment variable $RAHOSTLIST that can be set with the host lists to be used for the RAH commands that would overide reading from db2nodes.cfg

So i try with just a couple of nodes

export RAHOSTLIST=”hostname1 hostname2″

And the RAH command would work without any issues, So i added a few more hosts to the list it would work until 15 servers.

Back to the IBM Info center and realiaze there is another environment variable $RAHTREETHRESH which defaults to 15. Increased it to the number of physical servers in the db2nodes.cfg and it worked without setting the $RAHOSTLIST

And set the $RAHTREETHRESH to total number of partitiones to have db2_all commands work.

More details on the $RAHTREETHRESH Here

As you could see its meant for performance reasons, but it isn’t working as it supposed to be.

Poor Performance On Just One Node

January 12, 2014

Just the LOAD node in the DPF cluster, which is used by the batch jobs to perform LOAD’s was slow compared to the other servers

  • Connects to the database would take over 50secs (Other servers under a second)
  • Queries were running for over 50Secs while the same query will be back in couple of seconds on other servers
  • db2 get dbm cfg would show the same symptom

Just about any db2 commands, even db2pd would show this symptom. While OS related commands work ok

Server resources(CPU,Mmory,SWAP I/O and Network) were not being taxed, nothing in the db2diag.log, notify log or /var/log/messages would give any clue on the situation either.

After having no luck in the DB2 area, network team was involved and their trace showed a traffic to the LDAP server right before i see the commands return. Whatever command i execute is waiting on something for the first 50secs before contacting the LDAP server.

This helped a lot and I started looking at LDAP related files used by DB2 and noticed the /etc/nsswitch.conf file used for name resolution is empty, copying it over from another working server fixed it.

Notes:

FCM AUTOMATIC setting isn’t fast enough?

October 8, 2013

ENVIRONMENT:

DB2 v10.1 FP 2
OS: RHEL 6.4

PROBLEM:

On one of our DPF environment, we noticed a resource intensive job failed exhausting both the FCM Buffers & channels. Would see the following entries in the db2diag.log and also errors about “No FCM Buffer Available”

2013-10-03-05.49.47.787008-300 I518288292E547 LEVEL: Error
PID : 8369 TID : 46968990590720 PROC : db2sysc 21
INSTANCE: db2admn NODE : 021 DB : DBNAME
APPHDL : 901-58904 APPID: 10.96.37.168.38337.131003104954
AUTHID : USER HOSTNAME: hostname
EDUID : 173784 EDUNAME: db2agntp 21
FUNCTION: DB2 UDB, fast comm manager, sqkfChannelManager::AllocChannel, probe:5
RETCODE : ZRC=0x85590029=-2057764823=SQLKF_NO_CHANNEL
“No FCM Channels available”

This happened even though the FCM parameters are set to AUTOMATIC.

db2 get dbm cfg | grep -i fcm
No. of int. communication buffers(4KB)(FCM_NUM_BUFFERS) = AUTOMATIC(8192)
No. of int. communication channels   (FCM_NUM_CHANNELS) = AUTOMATIC(16384)

Did notice messages about DB2 increasing both the values, so the automatic did seem to be working. And didn’t have any limits on INSTANCE_MEMORY and had over 70% of Physical memory available on the server as well.

RESOLUTION:

Increasing the initial value to a higher number along with AUTOMATIC helped get the job completed ok.

UPDATE DBM CFG USING FCM_NUM_BUFFERS 32768 AUTOMATIC
UPDATE DBM CFG USING FCM_NUM_CHANNELS 32768 AUTOMATIC

ROOTCAUSE:

When FCM parameters are set to AUTOMATIC on Linux, it can only be increased 25% above the initial value. Its not clear from the documentation, whether the 25% limit is for an initial increase or only 25% is allowed above the initial value.

Record count doesn’t match the CARD in SYSCAT.TABLES

January 22, 2007

Ran into this interesting issue today. The count on the table showed only about 118Million rows, but the CARD in syscat.tables showed to have about 1.04 Billion rows.

db2 “select count(1) from Schema.tableName”

1
———–
118697936

1 record(s) selected.

db2 “select card from syscat.tables where tabname like ‘tableName'”

CARD
——————–
1048386270

1 record(s) selected.

RUNSTATS was executed just two days back so that wasn’t the issue. I did try updating the stats again, with no luck. Upon digging deeper, realized the data was skewed and the partitioning key chosen wasn’t the best.

Node number 11 alone had about 104Million rows, while the rest of the partitions combined holding the reminder.

db2 “select count(1) from Schema.tableName where nodenumber(partitioning_key) = 11

1
———–
104838627

1 record(s) selected.

Since RUNSTATS on partitioned database will only look at the data in the very first data node and extrapolate the value for CARD. For the 10 partitions, the CARD value it showed made sense.

As a temporary fix: RUNSTATS was executed in the non-skewed node.
Permament fix: Determine a better partitioning key that is more unique to have better distribution

Strange: Group privileges are ignored for CREATE VIEW

December 16, 2006

SQL0551N <user_name> does not have the privilege to perform operation “SELECT” on object
<table_name> <table_name>

If a user receives this error while creating views, make sure the user is granted SELECT privilege individually. Group privileges are not considered for any table or view specified in the CREATE VIEW statement.

No, its not a bug. Its documented 🙂

Update: Similar symptom while trying to INSERT from a Stored Procedure, the user should be granted INSERT cannot just be part of the group that has INSERT access.

Maddocks Systems Success Story – DB2 Express Video

December 9, 2006

Cross-node recovery – TSM

November 3, 2005

To recover/access the backup image of the database that was backed up from a different server. Typically to access production database from dev/qa

Comment out “PASSWORDACCESS = generate” in file “/usr/tivoli/tsm/client/ba/bin/dsm.sys” by adding a ‘*’ in the beginning of the line.

Now we should be able to query any backup image, provided you have the required details.

db2adutl query db <dbname> nodename <nodename> owner <owner> password <passwd>

Update the following db parameters in the target db with source db values from dsm.sys to restore the db backed-up from a different server

  • TSM_NODENAME
  • TSM_OWNER
  • TSM_PASSWORD

Now restore the database with ‘use adsm’ on the target server.

NOTE:

  • Revert back the dsm.sys & db params after the refresh. Else local backups to TSM will fail.

Other References: IBM Documentation

Fly back in time and recover backups! :-)

October 28, 2005



Parallel export for partitioned tables

October 25, 2005

db2_all “<<+$nodenum<db2 connect to $dbname; db2 export to $filename of del select * from $tablename where nodenumber($partitioningkey)=$nodenumber”

Example:

db2_all “<<+11<db2 connect to testdb; db2 export to /backup/test_table.del.11 of del select * from test_table where nodenumber(ID) = 11; db2 terminate”

NOTE:

  • Useful for large tables.
  • Be sure to export all the partitions in which the table is defined.
  • Export each partition to seperate filesystem for better performance.
 
Powered by Wordpress and MySQL. Theme by Shlomi Noach, openark.org