PeopleSoft Test Framework: Process.Run

The Basics:

Process.Run is a great feature of the PeopleSoft Test Framework.  Using this option wraps the running of a process into just a few easy steps, making the test script a lot easier to maintain.  In this post I use an Object Type dot Action notation, which corresponds to how a step would be entered in PTF.  Before the period is the Type field; after it is the Action field.  I’ve previously written a post on PTF installation and how to record a simple script, if you need to start from the very beginning.

PTF notation
Pseudo code to run a job looks like this:

Oracle posts SES install tips

I’ve been really busy lately and have not been able to find the time to finish up my troubleshooting post on SES yet.  Hopefully I’ll have more time for writing posts again in a few weeks.  I did want to call attention, though, to a post I noticed a few weeks ago on the PeopleSoft Technology Blog.  It speaks to the platform support of SES 11.1.2.2, the future support of SES 11.2.2.2 for PeopleTools 8.53, and also provides additional good reasons to keep your SES installs on hardware/VMs separate from your existing PeopleSoft infrastructure.

Read the full blog post here.

Part 2: PeopleSoft SES Configuration

Previously in Part 1, I wrote about the steps to install Oracle Secure Enterprise Search for use with PeopleSoft and I reviewed some considerations for PeopleSoft Admins.  Part 2 covers the integration configuration required to have PeopleSoft communicate with SES.  On the PeopleSoft side we must set up Integration Broker properly.  On the SES side we need to configure Identity Management for PeopleSoft.


SES for PeopleTools 8.53 Installation

There is a lot of information to cover about Oracle Secure Enterprise Search and PeopleSoft.  As I was writing this I decided it became too much for a single post, so I’ve decided to break it into at least three posts and maybe four:

  • Part 1 – SES Installation:  Installing the SES product for PeopleSoft
  • Part 2 – Integration Configuration:  Configuring PeopleSoft and SES to communicate properly
  • Part 3 – Administration:  How to administer the PeopleSoft/SES functionality
  • Probably Part 4 – PeopleSoft/SES troubleshooting:  Troubleshooting tips and methods for the SES/PeopleSoft relationship

Background:

Oracle Secure Enterprise Search is now the standard search engine for PeopleSoft with the release of the PeopleSoft 9.2 applications.  SES support was introduced in PeopleTools 8.52 with the new PeopleSoft Search Framework.  You can use SES with older applications; however, you may need to create search indexes yourself, because only the newest Feature Packs (HCM 9.1 FP 2, for example) may include delivered indexes.

The PeopleSoft Search Framework provides a standardized method for creating and maintaining search indexes for PeopleSoft that should be considered an improvement over what was required previously.  Very briefly, the Search Framework Designer is used to create a search definition based on PeopleSoft Query.  In the definition you can implement security, data mapping, categories, and related searches, for example.  The search is then deployed to Oracle SES through the Search Framework Administration pages.  You then run an Application Engine (AE) program, which creates your index, publishes a feed for Oracle SES, and initiates the crawler on the SES side.  All the interaction between PeopleSoft and Oracle SES is via Integration Broker.

Verity is no longer supported for the 9.2 applications or later but is still supported in 8.53 for older applications.  SES is a separate product that can be used to index and search just about anything.  According to documentation no additional licenses are needed to use SES with PeopleSoft, so a limited-use license is probably wrapped into the standard PeopleTools agreements for 8.52+.  SES includes WebLogic, Oracle Database Server, and the SES application itself.  From my understanding Oracle does not support the separation of these components and has stated “SES is packaged as a ‘software appliance’ where Oracle database, mid-tier, and the SES application are all tightly integrated and shipped in a single bundle”.

When planning your deployment of SES into your environment consider the following.

How I forward displays from Linux or other *nix systems to Windows

First off, I usually try to avoid it.  I’m old school and like my command line options, so when I’m installing PeopleSoft components I often use the console options if available.  Sometimes it’s not that simple, though, and when that happens I’ve got a pretty standard method of doing things.  So here’s how I forward a display from a Linux server to my Windows workstation.

Poll: Readers choose topic for the last week of June

I was playing with WordPress plugins today and found a polling plugin to play with (Yop Poll).  So what better way to check it out than to put up a Poll asking readers to choose a topic that I will post about.  Granted, the Poll is rigged, I did select all the topics after all, and I could have some of these already in the works anyway. 🙂  The Poll closes at the end of 06/23/13.


PeopleSoft TEMP/TMP Directories

In PeopleTools 8.53 Oracle has changed how the TEMP and TMP environment variables are handled.  They are taking the responsibility of setting these out of the Admins’ hands.  In the past we set these as environment variables, perhaps in the shell profile or psconfig.sh.  When we configured the app server / process scheduler, these settings would be inherited into the Tuxedo configuration.  With Windows 2008, some were impacted by its default of dynamic TEMP/TMP variables that are based on the session and deleted on logout.  On Windows a previous co-worker of mine got me into the practice of modifying the psappsrv.ubx and psprcsrv.ubx files and setting these variables there rather than relying on what the user in Windows might have set or trying to adjust them with scripts.  This proved to be helpful in many ways.  Now Oracle has decided to do the same thing by default.

This may impact Admins who are used to having these variables set to something they specifically wanted.  Depending on your setup you may prefer to change these back.  A default psappsrv.ubx file has the following section:

# --------------

*PS_ENVFILE
TEMP={LOGDIR}{FS}tmp
TMP={LOGDIR}{FS}tmp
TM_BOOTTIMEOUT=120
TM_RESTARTSRVTIMEOUT=120

{LOGDIR} is $PS_CFG_HOME/appserv/<DOMAIN>/LOGS and {FS} is the OS specific path delimiter.  So this will use a temp directory in the LOGS directory of each domain such as /opt/apps/psoft/domains/appserv/HCM92/LOGS/tmp.
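To make the substitution concrete, here is a minimal Python sketch of how the TEMP={LOGDIR}{FS}tmp template expands for a given domain.  This is not how Tuxedo or PeopleTools performs the expansion internally; the function name and the Windows example path are made up for illustration, and it assumes {LOGDIR} always resolves to the appserv/<DOMAIN>/LOGS layout described above.

```python
import ntpath
import posixpath


def resolve_temp(ps_cfg_home: str, domain: str, windows: bool = False) -> str:
    """Expand the template {LOGDIR}{FS}tmp for one app server domain.

    Hypothetical helper: it only mirrors the substitution described in
    the post and does not read any real .ubx file.
    """
    path = ntpath if windows else posixpath
    fs = path.sep  # {FS}: the OS-specific path delimiter
    # {LOGDIR}: $PS_CFG_HOME/appserv/<DOMAIN>/LOGS
    logdir = path.join(ps_cfg_home, "appserv", domain, "LOGS")
    template = "{LOGDIR}{FS}tmp"
    return template.replace("{LOGDIR}", logdir).replace("{FS}", fs)


print(resolve_temp("/opt/apps/psoft/domains", "HCM92"))
# prints /opt/apps/psoft/domains/appserv/HCM92/LOGS/tmp
```

Passing windows=True switches the delimiter to a backslash, which is all {FS} is doing in the template.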

I’m not sure how much of a fan I am of having TEMP in the logs directory.  I guess just knowing it’s moving is half the battle.  It’s easy to change back if you like.

See Oracle support document [ID 1486978.1] for the announcement.  I don’t recall seeing this in the release notes, but I might have missed it.  They also said it would be backported to 8.52.16, but you would need to recreate all your domains, not just reconfigure them, in order to get the change.  Oracle is adding these settings to the domain templates; once you create a new domain from a template, TEMP and TMP will be set in the appropriate ubx file.

App Server to Database Reconnection issues in 8.51

I ran into a problem a while ago which brought a more severe problem to my attention.  It appears that from at least 8.51.02 (but probably back to 8.51.00) through 8.51.09 there are issues with application server processes properly recovering from a disconnection from the database.  I don’t have firsthand experience with that problem, but there is some info on Oracle’s support site about it.  If you are running in this PeopleTools range and experiencing odd crashes every once in a while, this may be worth investigating.  With TraceSql=31 set, ORA-3113/3114 errors will show up in your logs.  In 8.51.08 a patch went in to fix it, but it broke something else, causing the problem I encountered.  Bug 11724645 has the details.

My particular problem was experienced on 8.51.09 and was limited to only the Integration Broker PUBSUB processes.  So apparently the PSAPPSRV code had been fixed by then as I never had a problem with those.   In this post I’ll discuss what I saw, some of the troubleshooting steps I used to isolate the problem, and some options I came up with to resolve it.

The Problem:

Integration Broker stops processing messages.   The processes don’t crash and look OK from a quick glance (psr and Process Explorer), but do nothing.  The environment was 8.51.09 all Windows 2008 on SQL Server.  The problem occurred everywhere, even in environments that restarted nightly.

Diagnosing the Problem:

I knew from day one something strange was occurring.  I had never needed to restart the PUBSUB processes this often before.  Almost daily some environment would need to be restarted, sometimes multiple environments, sometimes ones that had already been restarted.  Obviously it was off to the logs first.  There I found something interesting.  Here’s an example; the tables might be different depending on the process (PSPUBDSP, PSSUBDSP, and PSBRKDSP), but the main message is always the same:  The SELECT permission was denied on the object

PSSUBDSP_dflt.6076 (1) [01/28/12 10:00:25](3) File: E:\pt851-903-R1-retail\peopletools\src\pspubsub\statements.cppSQL error. Stmt #: 67  Error Position: 0  Return: 8601 – [Microsoft][SQL Server Native Client 10.0][SQL Server]The SELECT permission was denied on the object ‘PSAPMSGDSPSTAT’, database ‘HCMDEV’, schema ‘dbo’. (SQLSTATE 42000) 229
Failed SQL stmt:SELECT DSPSTATUS, IB_SLAVEMODE, DSPRESET, CLEANUP_DTTM FROM PSAPMSGDSPSTAT WHERE DISPATCHERNAME=:1 AND MACHINENAME=:2 AND APPSERVER_PATH=:3
PSSUBDSP_dflt.6076 (1) [01/28/12 10:00:25](1) GenMessageBox(200, 0, M): E:\pt851-903-R1-retail\peopletools\src\pspubsub\statements.cpp: A SQL error occurred. Please consult your system log for details.
PSSUBDSP_dflt.6076 (1) [01/28/12 10:00:40](3) File: E:\pt851-903-R1-retail\peopletools\src\pspubsub\statements.cppSQL error. Stmt #: 663  Error Position: 0  Return: 8601 – [Microsoft][SQL Server Native Client 10.0][SQL Server]The SELECT permission was denied on the object ‘PSAPMSGSUBCON’, database ‘HCMDEV’, schema ‘dbo’. (SQLSTATE 42000) 229
Failed SQL stmt:SELECT IBTRANSACTIONID, IB_SEGMENTINDEX, QUEUENAME, IB_OPERATIONNAME, ACTIONNAME, SUBCONSTATUS, PROCESS_INSTANCE FROM PSAPMSGSUBCON WHERE SUBCONSTATUS IN (0,10)AND PROCESS_INSTANCE > 0
PSSUBDSP_dflt.6076 (1) [01/28/12 10:00:40](1) GenMessageBox(200, 0, M): E:\pt851-903-R1-retail\peopletools\src\pspubsub\statements.cpp: A SQL error occurred. Please consult your system log for details.
PSSUBDSP_dflt.6076 (1) [01/28/12 10:00:40](3) File: E:\pt85109b-retail\peopletools\src\psmgr\mgrvers.cppSQL error. Stmt #: 881  Error Position: 0  Return: 8601 – [Microsoft][SQL Server Native Client 10.0][SQL Server]The SELECT permission was denied on the object ‘PSVERSION’, database ‘HCMDEV’, schema ‘dbo’.
[Microsoft][SQL Server Native Client 10.0][SQL Server]The cursor was not declared. (SQLSTATE 37000) 16945
Failed SQL stmt:SELECT VERSION FROM PSVERSION WHERE OBJECTTYPENAME = ‘SYS’
PSSUBDSP_dflt.6076 (1) [01/28/12 10:00:40](1) GenMessageBox(200, 0, M): E:\pt85109b-retail\peopletools\src\psmgr\mgrvers.cpp: A SQL error occurred. Please consult your system log for details.

My first reaction was to check the permissions for the ACCESSID user and, of course, nothing was out of the ordinary there.  I searched Oracle support and found a case indicating that I needed to synchronize the ACCESSID after a Tools upgrade to 8.50+ on SQL Server, but from what I could see the account was set up just fine.  That’s when I took a look at the database connections and saw something bizarre.  There was a connection to the database as user people.  I waited a minute and looked again; the same connection was still there as people.  Now that shouldn’t happen.  As you should know, the people user is very limited in what it can do.  It’s really only used to validate OPERIDs and retrieve the ACCESSID and password.  I queried sys.dm_exec_sessions for the host_process_id so I could see which process had been connected as people for so long.

login_name session_id login_time              program_name host_process_id status
psaccess   77         2012-01-28 04:16:11.970 PeopleSoft   5624            sleeping
people     81         2012-01-28 04:16:12.030 PeopleSoft   6076            sleeping
psaccess   82         2012-01-28 04:16:12.030 PeopleSoft   5612            sleeping

Once I got the host_process_id, I went back to the app server and confirmed what the log was already telling me.  PID 6076 on the app server was the PSSUBDSP process.  The same one that didn’t have select permission anymore and of course it couldn’t select, it was connected as people.  I also noticed the login_time of the processes seemed odd.  I didn’t restart that system at 04:16 and in fact looking at the processes in Process Explorer indicated the processes had been running quite a while longer.  Now the processes will restart after a certain amount of work load, but that should rarely have them all reconnecting to the database at the same time.  In the logs I didn’t see anything around that time, in fact, the errors didn’t show up until several hours after the login_time.

I restarted the PUBSUB processes and saw that after restarting all processes were once again connected as the ACCESSID.  I decided to see what happened if I killed the connection for the process from the database.  I killed a newly connected PSSUBDSP process by matching the session_id from sys.dm_exec_sessions with the host_process_id again.  The process reconnected as people and never made the switch to the ACCESSID again.

The Test Plan:

I decided to dig in a little deeper.  I was going to file a case with Oracle since I had not found anything on their support site that seemed to address this at all.  I also wanted to have someone else try to replicate it on a different version of PeopleTools.  Plus I didn’t know why the error didn’t really show up until some time later.  To make this section shorter, I did several things to validate my theory (after a network/database disconnection the reconnect process was broken) and ensure I had the detail needed for the support case.  I validated that outbound port numbers changed with netstat, turned up tracing a bit, and forced messages through.  Something I found during this was that the error would not start showing in the app server logs until the system tried to process a message.  I ended up coming up with the following test plan to provide to Oracle and others to use:

  1. Set TraceSql=7 in psappsrv.cfg
  2. Review newly created <USER>_PSSUBDSP_dflt.tracesql log to determine pid for PSSUBDSP process
  3. While reviewing the log, verify that transactions are committed after every SQL statement is run;  look for a line like:
    PSSUBDSP_dflt.17196 (4)      1-3      16.25.27    0.005000 Cur#20.17196.FINDEV RC=0 Dur=0.003000 Commit
  4. Run netstat -a -o | find "<pid>"  : noting outbound TCP port
  5. Determine SID to kill and the user it’s logged in as
    select session_id, login_name from sys.dm_exec_sessions where host_name='<App Server HOSTNAME>' and host_process_id=<pid>
  6. kill <sid> :  to kill connection
  7. Wait 15 – 30 seconds
  8. Rerun netstat -a -o | find "<pid>"  : noting outbound TCP port.  Did it change?  It should have.
  9. Review <USER>_PSSUBDSP_dflt.tracesql log to determine if transactions are now failing; look for a line like
    PSSUBDSP_dflt.17196 (62)      1-3      17.00.24    0.002000 Cur#1.5844.FINDEV RC=0 Dur=0.001000 Rollback
  10. Rerun the SQL above to determine if the login_name has changed
    select session_id, login_name from sys.dm_exec_sessions where host_name='<App Server HOSTNAME>' and host_process_id=<pid>
  11. If login_name = people or transactions are saying Rollback in the log, you have a problem.  Review the APPSRV log to check for the following errors.  None should exist until the next message is processed.
    PSSUBDSP_dflt.5844 (63) [11/12/12 15:14:56 Dispatch](3) File: E:\pt851-903-R1-retail\peopletools\src\pspubsub\statements.cppSQL error. Stmt #: 67  Error Position: 0  Return: 8601 – [Microsoft][SQL Server Native Client 10.0][SQL Server]The SELECT permission was denied on the object ‘PSAPMSGDSPSTAT’, database ‘FINDEV’, schema ‘dbo’. (SQLSTATE 42000) 229
    Failed SQL stmt:SELECT DSPSTATUS, IB_SLAVEMODE, DSPRESET, CLEANUP_DTTM FROM PSAPMSGDSPSTAT WHERE DISPATCHERNAME=:1 AND MACHINENAME=:2 AND APPSERVER_PATH=:3
    PSSUBDSP_dflt.5844 (63) [11/12/12 15:14:56 Dispatch](1) GenMessageBox(200, 0, M): E:\pt851-903-R1-retail\peopletools\src\pspubsub\statements.cpp: A SQL error occurred. Please consult your system log for details.
  12. Force a message through the system. I was just locking and unlocking my account on the HR side to force the message over to another application.
  13. Check to see if the message was processed and review APPSRV log for error.
  14. Test either Passed: Message processed OK, or Failed: Message stuck in New status on subscription side and error in the APPSRV log.
  15. Set TraceSql=0 in psappsrv.cfg
  16. Restart PUBSUB processes to correct any connection problem.
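Steps 3 and 9 of the plan come down to reading the tracesql log and checking whether transactions end in Commit or Rollback.  The following is a minimal Python sketch of that check.  It assumes trace lines end the way the two sample lines above do (RC=..., Dur=..., then the outcome); the function names are hypothetical, and a single Rollback is not proof of the problem on its own, so treat the result as a hint to run the login_name query.

```python
import re

# Matches the tail of a trace line such as "... RC=0 Dur=0.003000 Commit".
# The pattern is inferred from the sample lines in the test plan only.
OUTCOME = re.compile(r"\bDur=\S+\s+(Commit|Rollback)\s*$")


def transaction_outcomes(lines):
    """Return the Commit/Rollback outcomes found in the trace, in order."""
    outcomes = []
    for line in lines:
        match = OUTCOME.search(line)
        if match:
            outcomes.append(match.group(1))
    return outcomes


def looks_broken(lines):
    """True if the most recent transaction outcome in the trace is a Rollback."""
    outcomes = transaction_outcomes(lines)
    return bool(outcomes) and outcomes[-1] == "Rollback"


sample = [
    "PSSUBDSP_dflt.17196 (4)      1-3      16.25.27    0.005000 Cur#20.17196.FINDEV RC=0 Dur=0.003000 Commit",
    "PSSUBDSP_dflt.17196 (62)      1-3      17.00.24    0.002000 Cur#1.5844.FINDEV RC=0 Dur=0.001000 Rollback",
]
print(transaction_outcomes(sample), looks_broken(sample))
# prints ['Commit', 'Rollback'] True
```

In practice the lines would come from the <USER>_PSSUBDSP_dflt.tracesql file rather than an in-memory list.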

The Fix:

In a test environment I validated that the PUBSUB processes were fixed in 8.51.10.  Unfortunately, a minor PeopleTools patch was not an option at the time for production, so a workaround was in order.  What would be the best way to identify this problem and take corrective action?  If you’ve read some of my other posts you might know blindly restarting every night isn’t my style, and as I saw, it didn’t guarantee anything.  Whatever was causing the disconnect could happen at any time; you might need to restart every 30 minutes to ensure decent availability for integration.  I needed to detect the problem, identify which database on the SQL Server was impacted, and restart only what was impacted.  Time to break out my scripting fingers.  I came up with two scripted solutions.  The first solution I wrote was a SQL script that could be scheduled which would:

  1. Identify connections to the database as people that were older than X minutes
  2. Execute a PowerShell script on the database server, providing the server name, database, and host PID of the offending process
  3. That database server side PowerShell script would then remotely execute PowerShell commands on the correct app server which
  4. Ensured the host PID provided was for a PUBSUB process and
  5. Executed my normal PUBSUB restart script for the domain

This method had several challenges that I did not really try to overcome.

  1. The database user running the SQL script needs xp_cmdshell.  This is a big security concern in many shops and in general something that should probably be frowned upon.  It would probably be easy enough to have the PowerShell script run the SQL and collect the data as well.  But I didn’t look into it.
  2. My SQL script assumed the app server domain running PUBSUB had the same name as the database.  If you ran multiple or differently named domains it would need to be tweaked to take those into account.
  3. Remote PowerShell capabilities had to be turned on, another possible security concern.
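The detection step of this first approach (item 1 in the list of what the SQL script would do) is just a filter on session rows.  Here is a minimal Python sketch of that filter, not the SQL script itself.  The rows mirror the sys.dm_exec_sessions output shown earlier with milliseconds dropped, and the five minute threshold and function name are assumptions, since the post only says “X minutes”.

```python
from datetime import datetime, timedelta


def stale_people_sessions(sessions, now, max_age_minutes=5, login="people"):
    """Return sessions connected as `login` for longer than max_age_minutes.

    `sessions` is a list of dicts keyed like the sys.dm_exec_sessions
    columns used in the post; the threshold default is an assumption.
    """
    cutoff = now - timedelta(minutes=max_age_minutes)
    return [
        s for s in sessions
        if s["login_name"] == login and s["login_time"] <= cutoff
    ]


sessions = [
    {"login_name": "psaccess", "session_id": 77,
     "login_time": datetime(2012, 1, 28, 4, 16, 11), "host_process_id": 5624},
    {"login_name": "people", "session_id": 81,
     "login_time": datetime(2012, 1, 28, 4, 16, 12), "host_process_id": 6076},
    {"login_name": "psaccess", "session_id": 82,
     "login_time": datetime(2012, 1, 28, 4, 16, 12), "host_process_id": 5612},
]
now = datetime(2012, 1, 28, 10, 0, 25)
for s in stale_people_sessions(sessions, now):
    print(s["session_id"], s["host_process_id"])
# prints 81 6076
```

The host_process_id that comes back is what would be handed to the app server side to confirm it belongs to a PUBSUB process before restarting anything.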

I also wrote another PowerShell script that was application server side based.  I have a large script infrastructure that is already deployed to any Windows app servers and that somewhat mimics psconfig.sh on the *nix servers.  This script leverages that infrastructure, and with one additional script, I can monitor all domain APPSRV logs on a server for the error.  This script does the following:

  1. Reads a central file that includes all application domains on the server
  2. Uses pattern matching to find a domain with “select permission was denied” in the log file
  3. If an error is found, the script waits for 20 seconds and rechecks the log file
  4. It compares the number of error lines before and after the 20 seconds (default polling interval is 15 seconds)
  5. If the count increases, it restarts PUBSUB and emails me notification of the restart
  6. If the count is the same, we must have restarted previously, so we do nothing

This method also has some disadvantages, but it is the route I chose:

  1. It won’t detect the problem until the error is in the log, which could be hours later.  However, there is no user/functional impact until the error arrives as that is the first time a message is actually processed.
  2. I’m not doing anything fancy to track file offsets and start reading where I left off, therefore, the larger the file the longer the script will take to run.  I would not recommend running this against log files that are growing really large or have tracing on.
  3. This script needs to be scheduled on each physical application server instead of a smaller number of database servers.
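The count, wait, and recount logic at the heart of the log-monitoring approach can be sketched in a few lines of Python.  This is not the PowerShell script described above: the function names are hypothetical, it handles a single log file rather than reading a central list of domains, and it only returns a decision instead of restarting PUBSUB or sending email.

```python
import time

ERROR_TEXT = "select permission was denied"


def count_errors(log_path):
    """Count log lines containing the permission error (case-insensitive)."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return sum(1 for line in log if ERROR_TEXT in line.lower())


def needs_restart(log_path, wait_seconds=20, sleep=time.sleep):
    """Return True if the error count grows during the wait window."""
    before = count_errors(log_path)
    if before == 0:
        return False  # no error logged, nothing to do
    sleep(wait_seconds)
    after = count_errors(log_path)
    # An unchanged count is taken to mean an earlier restart already
    # cleared the problem, matching item 6 in the list above.
    return after > before
```

Because the whole file is read twice on every run, the second disadvantage above applies to this sketch as well.  The sleep parameter exists only so the wait can be replaced when testing.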

I never did identify the real culprit behind the processes being disconnected from the database; it seems to be pretty hit and miss.  I had some ideas, but once I got the automatic restarts scheduled everyone’s interest in the problem died down significantly.

Parallel Pagelet Loading

PeopleTools 8.52 added a performance enhancement for systems that have multiple pagelets on a page.  Previously pagelets were loaded sequentially, one after another.  With this enhancement the pagelets can be loaded in parallel, allowing your pages to load faster.  Oracle indicates that a page with seven pagelets can reduce its loading time by 15%.

With 8.53.01 this setting is delivered enabled, but in 8.52 this feature is not enabled.  In fact, the configuration setting does not even exist.  You have to add it manually.  That was apparently a misstep by the development team according to Oracle support [ID 1503803.1].

To enable this option in 8.52 add the line
parallelLoading=true
to your configuration.properties file which is located at <PS_HOME>/<domain name>/applications/peoplesoft/PORTAL.war/WEB-INF/psftdocs/<site name>/configuration.properties.  Jolt pooling is also required for this setting.
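If you manage several sites, adding the line by hand gets tedious.  Below is a minimal Python sketch that adds or updates the setting in the text of a configuration.properties file.  The function name is hypothetical, it works on a string rather than touching any file, and it treats the file as simple key=value lines, which is an assumption about the format rather than a full properties parser.

```python
def enable_parallel_loading(text):
    """Return configuration.properties content with parallelLoading=true set.

    An existing parallelLoading line is overwritten in place; otherwise
    the setting is appended.  Commented-out lines are left untouched.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        key = line.split("=", 1)[0].strip()
        if key == "parallelLoading":
            lines[i] = "parallelLoading=true"
            break
    else:
        # No existing setting found, so add it at the end of the file.
        lines.append("parallelLoading=true")
    return "\n".join(lines) + "\n"
```

Read the file, pass its contents through the function, and write the result back after taking a backup of the original.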

If you deploy lots of pagelets on your homepage you may want to do some performance analysis of how your system will handle peak times prior to enabling this feature.  Adding lots of pagelets to the homepage, especially in Portal / Interaction HUB, should be done with caution, particularly in older versions of PeopleSoft.  I’d really like to post a detailed article on this topic with an analysis of how 8.52 and 8.53 may have improved how this type of configuration responds.  But if I had to sum it up briefly, it would be this:  while some analysts, developers, or management types may think it’s a great idea to provide data to the user in Portal / Interaction HUB in the form of pagelets upon logon, it can have a substantial impact on system performance and possibly even bring down your systems.  It needs to be tested thoroughly and under substantial load.  Parallel Pagelet Loading may help alleviate some of the issues I’ve seen in the past, but I will need to test it firsthand to know.