Surviving A Month of Dead Connection Detection Zombie Land!

Timeline

August 11th – Switched over to a physical standby on new hardware for our ERP software in about 15 minutes. We did this manually from the SQL*Plus command line because using Data Guard for the switchover had given us hangs and connection-related issues (the kind that never show up until switchover time). We are ecstatic at this point because, all at the same time, we also implemented huge pages on 64-bit Red Hat Linux, resized the SGA, moved to encrypted filesystems, set the DB_ULTRA_SAFE parameter, installed a 10Gb NIC and implemented ASMM. Life couldn’t be better. This is Oracle 11.2.0.3, and all SQL*Net traffic is encrypted using Native Network Encryption….keep all of this in mind as you follow along on this horror story!

August 15th – Network guys updated Cisco switches and routers with software/firmware upgrades with planned downtime. Life seems normal up to this point.

August 16th – As DBA I scan the alert logs every 15 minutes for any of the following keywords: ORA and/or WARNING

I start getting complaints from end users that our ERP is slow, freezing, etc. At this point we suspect the switchover to new hardware. But we had made so many changes; which one could it be? I see a 10% increase in waits related to the redo logs (log file parallel write). So I moved the redo logs to an unencrypted filesystem area….it made no difference, so I moved them back. I could reset DB_ULTRA_SAFE to improve performance, but that requires a database cycle.

By the way when mentioning encrypted native filesystems…this is Oracle’s answer to a Support ticket asking about compatibility.

“This is a 3th party issue, we have our own solution which would be TDE tablespace encryption,
for any 3th party solution to properly work, it must be completely transparent to oracle,
the normal read / write OS calls oracle does must be redirected to the decrypt / encrypt code, it
is possible asynch_io can no longer work and you also may need to set parameter disk_asynch_io = false,
otherwise it is entirely up to the 3th party product being tested and certified to run with oracle
by the 3th party vendor.”

As the university went into full work mode for our FALL semester…performance issues worsened. So I started looking in the alert logs for other errors I wasn’t capturing….now I start seeing a LOT of the 12170 errors….so I modify my script to email those when they occur by adding the keyword FATAL.
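For anyone who wants to try the same kind of keyword scan, here is a minimal sketch. It is not my original script: the function name find_alerts, the keyword tuple and the sample lines are all made up for illustration, and the emailing part is left out.

```python
import re

# Keywords to flag; FATAL catches the "Fatal NI connect error" lines.
KEYWORDS = ("ORA", "WARNING", "FATAL")

# Case-insensitive whole-word match, so "ORA-3136" and "Fatal" both hit,
# while a word such as "STORAGE" does not.
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def find_alerts(lines):
    """Return (line_number, text) pairs for lines containing a keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Illustrative sample; in practice you would read the alert log file.
    sample = [
        "Thread 1 advanced to log sequence 4711",
        "WARNING: inbound connection timed out (ORA-3136)",
        "Fatal NI connect error 12170.",
    ]
    for number, text in find_alerts(sample):
        print(number, text)
```

Running something like this from cron every 15 minutes and mailing the hits would roughly match what I describe above.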

Errors started to come in groups. I had already seen the occasional ORA-3136, which I knew to ignore as an error related to logging in.
WARNING: inbound connection timed out (ORA-3136)
Fatal NI connect error 12170.

August 21 – Sept 5 – This is where I dread coming to work every day. I am spending 10 hours or so a day trying to figure out what is wrong, and my ulcer also kicks into high gear.

The Oracle Enterprise Manager graphs are looking unusual…..huge NETWORK waits, especially for jobs that connect to other databases using database links. See the following for how it appeared to me. BAD, BAD, BAD…..for everyone. So now the guessing goes into full swing…and I mean GUESSING, because we don’t have much to go on except lots of spurious ORA errors. By the way, this is an Oracle Forms app….so we don’t have expensive tools to trace sessions all the way back to the database. There is some functionality for tracing, but it only indicated that the hanging/freezing sessions were gone from the database after a point in time. But what disconnected them?

[Image: NetworkZombie]

I surmised that the freezing people experienced with this FORMS app was due to the 30 network retries we have set in the FORMS configuration. We have been using this parameter for many years; it attempts a reconnection up to 30 times before giving up. So…this is the freezing, as it takes time to retry 30 times. People end up closing their browser and that generates the 12170 errors.

I notice some patterns – these 12170 errors come all at once in batches through the alert logs. I finally figured out that the smallest error number in each error stack is the important one – 110, the nt secondary err code.

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.3.0 – Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.3.0 – Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 – Production
Time: 29-AUG-2014 18:14:10
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505

TNS-00505: Operation timed out
nt secondary err code: 110
nt OS err code: 0
Client address: (ADDRESS=

See MOS Note

Alert Log Errors: 12170 TNS-12535/TNS-00505: Operation Timed Out (Doc ID 1628949.1)
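If you want to pull the codes out of one of these error stacks without squinting, a small sketch like this does it. The function name parse_error_codes is made up for illustration, and I am assuming (as I read the MOS note) that the nt secondary err code is the operating system errno, which on Linux makes 110 ETIMEDOUT, i.e. "Connection timed out".

```python
import os
import re

# Matches lines such as "nt secondary err code: 110" in the alert log.
CODE_LINE = re.compile(r"^\s*(ns|nt) (main|secondary|OS) err code:\s*(\d+)\s*$")


def parse_error_codes(block):
    """Map e.g. 'nt secondary' -> 110 for every err-code line in the block."""
    codes = {}
    for line in block.splitlines():
        match = CODE_LINE.match(line)
        if match:
            layer, kind, value = match.groups()
            codes[f"{layer} {kind}"] = int(value)
    return codes


# The err-code lines from the error stack shown above.
SAMPLE = """\
ns main err code: 12535
ns secondary err code: 12560
nt main err code: 505
nt secondary err code: 110
nt OS err code: 0
"""

if __name__ == "__main__":
    codes = parse_error_codes(SAMPLE)
    os_errno = codes["nt secondary"]
    # os.strerror returns the local platform's text for that errno value,
    # so run this on the same OS as the database host.
    print(os_errno, os.strerror(os_errno))
```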

We are pretty sure this is network-related or OS-related, but how do we convince the network guys something is wrong? I am digging through MOS for everything I can find on tracing and the like. I start tracing on the database server, generating huge amounts of logs (50 GB at a time), trying to find error numbers that I haven’t seen. I see a lot of the following, and if you search through MOS you can find information on those errors.

[16-SEP-2014 18:11:01:606] nsfull_pkt_rcv: error exit
[16-SEP-2014 18:11:01:606] nioqer: entry
[16-SEP-2014 18:11:01:606] nioqer:  incoming err = 12151
[16-SEP-2014 18:11:01:606] nioqce: entry
[16-SEP-2014 18:11:01:606] nioqce: exit
[16-SEP-2014 18:11:01:606] nioqer:  returning err = 3113

Finally I find a document that mentions the 12151 and 3113 are really just spurious and not the real cause of our problems. Tracing on the database side didn’t really help….basically we determined the sessions were gone at that point and the traces just verified this…Oracle didn’t know what happened to them so the lack of information should be speaking volumes at this point.

Talking with the network guys throughout all of this….they cannot find anything wrong.

We ask them to turn off SQL*Net packet inspection as per this MOS Doc:

Troubleshooting guide for ORA-12592 / TNS-12592: TNS: bad packet (Doc ID 373431.1)

Still the problems persist….what is wrong?  We are so desperate at this point we start second-guessing everything that was done during the switchover. And this is where we start heading down the road of too many mods! Guess what….the following list shows what DIDN’T HELP.

1. Switched back to the 1Gb NIC

2. Switched hardware load balancers

3. Upgraded the database to 11.2.0.4 because of an Oracle bug

10096945 Waits using DBLINKS & nested loops

We had to install two more patches on top of 11.2.0.4 (plus the CPU) to get rid of some issues with that version which produced ORA-904 errors. Turning off query rewrite fixed one of the problems; we also flushed the shared pool.

17956707 ORA-904 executing SQL over a database link 19/May/2014
17551261 ORA-904 "from$_subquery$_003".<column_name> with query rewrite 21/Feb/2014

4. Moved to 12.1.0.2 listener

5. Installed p17956707_112042_Linux-x86-64.zip patch to fix ORA-904 as a result of 11.2.0.4
Reinstalled recreatectxsyssyncrn.sql – bug in 11.2.0.4
6. Installed/ran OSWatcher on the database server. How does one even start to understand what it produces, especially the network stats?

7. Added USE_NS_PROBES_FOR_DCD=TRUE in sqlnet.ora (this reverts to an 11g type of Dead Connection Detection)
8. Enabled a job to restart services on a different application that was losing its connections as well.
9. Removed one of the three Oracle Forms/Reports Servers from the load balancer. The plan was to wipe it and reinstall a fully patched version for redeployment.

10. Ran RDA against the forms server, using it for a contact on this issue. Lots of disconnects showing up in the logs.

11. Was it a recent Java desktop update? Was it the Java version on WebLogic?

12. Results of traces on the database
Lots of 12151 and 3113 errors which are spurious in nature
ora-12547 in  server_45148.trc
MOS Notes 1104673.1 , 1591874.1 & 1300824.1, 1531223.1,  461053.1

13. Recompiled all of the forms, rewrote bad application code

14. Reconfigured, rebooted and tweaked all settings on the LOAD BALANCER

15. OK, this was the BAD thing we did….modified the Linux operating system TCP keepalive parameters on the database host. We had never done this before and didn’t know what we were doing. Upped it to two hours…and then upped it again to over 8 hours, assuming that was giving a session exclusive access for that time period.

About this time….the network guys did realize the Network Intrusion System was listening on the connections between the servers: not the outside traffic, but the protected/firewalled interior traffic. Oops….huge bottleneck because it was inadequately sized. So they reconfigured it.

Life seemed OK….it was better in some respects. Some of our other applications that connected to the same database turned back to normal activity but the FORMS app was still choking.

Well, my wonderful boss finally decided to start SQLPLUS trace sessions from different vantage points to see what happened and to determine whether any kind of session was affected.

1. SQLPLUS from our desktops

2. SQLDEVELOPER from our desktops

3. SQLPLUS from a server in the same subnet

4. SQLPLUS from other servers in a different subnet

5. SQLPLUS from the app server in a different subnet, not on the load balancer

6. SQLPLUS from the app servers still on the load balancer

We connected to the database in question and started the wait…..we realized the disconnects seemed to happen somewhere between one hour and two hours. Isn’t that network? Not necessarily…..we were seeing an ORA-3135, which is a completely different error number from anything I had seen so far.

Finally I have an error number that helps with searching on MOS, and I start finding a lot of better information as it applies to Dead Connection Detection. Thanks to the tracing I knew our DCD was in place and working!

[16-SEP-2014 15:24:00:384] niotns: Enabling CTO, value=180000 (milliseconds)
[16-SEP-2014 15:24:00:384] niotns: Enabling dead connection detection (10 min)
[16-SEP-2014 15:24:00:384] niotns:  listener bequeathed shadow coming to life…
[16-SEP-2014 15:24:00:384] nsinherit: entry
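With 50 GB of trace files, grepping out the two timer lines by script beats scrolling. A rough sketch follows; the function name trace_timers is hypothetical, and the two patterns are simply copied from the trace lines above, so other versions may word them differently.

```python
import re

# Patterns copied from the 11.2.0.3 server trace lines shown above.
DCD_LINE = re.compile(r"Enabling dead connection detection \((\d+) min\)")
CTO_LINE = re.compile(r"Enabling CTO, value=(\d+) \(milliseconds\)")


def trace_timers(lines):
    """Return the DCD interval (minutes) and CTO value (ms) found in the trace."""
    found = {}
    for line in lines:
        dcd = DCD_LINE.search(line)
        if dcd:
            found["dcd_minutes"] = int(dcd.group(1))
        cto = CTO_LINE.search(line)
        if cto:
            found["cto_ms"] = int(cto.group(1))
    return found


if __name__ == "__main__":
    sample = [
        "[16-SEP-2014 15:24:00:384] niotns: Enabling CTO, value=180000 (milliseconds)",
        "[16-SEP-2014 15:24:00:384] niotns: Enabling dead connection detection (10 min)",
    ]
    print(trace_timers(sample))
```

An empty result for a session would mean the trace never showed DCD being enabled for it.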

Ora-03135: Connection Lost Contact After Idle Time (Doc ID 1380261.1)
Troubleshooting ORA-3135/ORA-3136 Connection Timeouts Errors – Database Diagnostics (Doc ID 730066.1)
Troubleshooting ORA-3135 Connection Lost Contact (Doc ID 787354.1)

“Idle Connection Timeout

The most frequently occurring reason for this error is due to a Max Idle Time setting at the firewall.
If the client traverses the firewall to get to the server and is being terminated abruptly, this is likely the cause.

A relatively simple test to determine if this is a Firewall maximum idle time issue:

At the offending client, establish a SQL*Plus or OCI client connection to the server.

SQL*Plus username@TNS connect string
password:

SQL>  <===Allow this client connection to sit idle for an hour.

Return after the hour is up and issue a simple query:

SQL>select * from dual;

If this connection is terminated with ANY error it is likely that your firewall will not allow a connection to remain idle for a lengthy period of time.

It is possible to trick the firewall with Dead Connection Detection packets in order to keep the connection alive.”

WHAT!…..I HAD DCD in place all along….it is still saying firewall but read on as there is a gotcha or caveat to DCD.

“PLEASE NOTE:
 DCD was never designed to be used as a “virtual traffic generator” as we are wanting to use it for here. This is merely a useful side-effect of the feature.
In fact, some later firewalls and updated firewall firmware may not see DCD packets as a valid traffic possibly because the packets that DCD sends are actually empty packets. Therefore, DCD may not work as expected and the firewall / switch may still terminate TCP sockets that are idle for the period monitored, even when DCD is enabled and working.
In such cases, the firewall timeout should be increased or users should not leave the application idle for longer than the idle time out configured on the firewall.”

At this point we are completely discouraged as we believe that there is no fix….we could implement PROFILES setting an idle timeout on the database so people would have to log in again every hour. At least that would stop the freezing due to the 30 network retries.

Finally the last clue we needed to figure out how to get rid of the ZOMBIES…….as part of a MOS document.

“The firewalls inactivity timer can be disabled for the affected host.

The host OS keep alive setting (tcp_keep_alive) can be modified to be less than the firewall inactivity timeout. This will cause the OS to send a test packet to the client when the timeout is reached and the client will respond with an ACK. To all intents and purposes this is the same as turning off the firewall inactivity timer for this host.”

During all of the mods we tried, we had raised tcp_keep_alive above the firewall’s default idle timeout of 1 hour…so we broke it ourselves while trying to fix the original connection issue that the IPS caused. Applications that only connected for a few seconds weren’t affected, but all apps that required a sticky session for more than one hour were…
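The rule from the MOS note boils down to one comparison: the OS keepalive idle time has to be shorter than the firewall idle timeout. Here is a minimal sketch of that check, assuming a Linux host where the setting is readable at /proc/sys/net/ipv4/tcp_keepalive_time (in seconds). The function names and the 1 hour firewall value are just illustrations from this story; ask your network team for the real timeout.

```python
from pathlib import Path

# Linux exposes the keepalive idle time (in seconds) here.
KEEPALIVE_FILE = Path("/proc/sys/net/ipv4/tcp_keepalive_time")


def keepalive_is_safe(keepalive_seconds, firewall_idle_seconds):
    """True when the OS probes an idle socket before the firewall drops it."""
    return keepalive_seconds < firewall_idle_seconds


def read_keepalive_seconds(path=KEEPALIVE_FILE):
    """Read the current tcp_keepalive_time, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    firewall_idle = 60 * 60  # assumed: the 1 hour firewall default from this story
    current = read_keepalive_seconds()
    if current is None:
        print("tcp_keepalive_time not readable on this host")
    else:
        verdict = "OK" if keepalive_is_safe(current, firewall_idle) else "TOO HIGH"
        print(f"tcp_keepalive_time={current}s firewall={firewall_idle}s -> {verdict}")
```

With our "over 8 hours" setting this check would have come back TOO HIGH straight away.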

The horror story is now over…but I still have an ulcer; hopefully that will heal in time. Now on to the database upgrade to 12.1.0.2….as I see that DCD is handled completely differently in that one, I expect to experience this ZOMBIE stuff again. But at least I have more knowledge than what I started with.

So the final conclusion is that, since the network switch/router upgrades that happened way back when this started, the network gear no longer recognizes Oracle’s DCD packets (they are zero length), but it does recognize the OS keepalive packets (non zero length).

Errors seen when firewall is blocking connections – rule-based
Fatal NI connect error 12514, connecting to:
Fatal NI connect error 12514, connecting to:
Fatal NI connect error 12514, connecting to:

Posted in Uncategorized | 3 Comments

Deinstalling an ORACLE_HOME in 11gR2 DB = MORE WORK!

This is a post about why you shouldn’t use the deinstall script (it is a perl script on UNIX machines) without some testing on a non-production server. Why? It has the ability to remove important database components that may still be needed on that server, especially if you haven’t laid out database component locations according to the Oracle Flexible Architecture (OFA) method.

I typically start up the listener in the highest-version $ORACLE_HOME to service all databases on a particular server. Running the ORACLE-provided deinstall on an old decommissioned $ORACLE_HOME removed all listeners in the production $ORACLE_HOME. During the run of the deinstall script there didn’t seem to be any way around this step: I was unable to leave it blank or write in a wrong answer. So I created a bogus listener that was never going to be used, just for the script to remove. MORE WORK!

Specify all Single Instance listeners that are to be de-configured [LISTENER1,LISTENER2]: none

Invalid listener list [ none]. You can only specify a subset of the configured listeners.

At least one listener from the discovered listener list [LISTENER1, LISTENER2] is missing in the specified listener list [LISTENER1]. 
The Oracle home will be cleaned up, so all the listeners will not be available after deinstall. 
If you want to remove a specific listener, please use Oracle Net Configuration Assistant instead. Do you want to continue? (y|n) [n]:

Basically it requests to remove all LISTENERS (they are automatically discovered during the run) and you have to choose at least one to remove.

At least it didn’t try to remove databases that weren’t in this ORACLE_HOME.

Specify the list of database names that are configured in this Oracle home []:

Apparently there are a lot of problems (i.e. bugs) with the deinstall scripts, so Oracle recommends not using them. The following note confirms that the deinstall doesn’t behave as it should! So very naughty. It recommends downloading and installing yet another utility to manage things. MORE WORK! The downloaded utility is specific to the ORACLE version (e.g. linux.x64_11202_deinstall.zip) and operating system, so now I have to keep more software on hand to accomplish what used to be a simple task using the ORACLE UNIVERSAL INSTALLER. While this type of script may be useful for certain test environments that depend on extensive scripting capabilities like the deinstall tool, I am wondering how to use it safely to remove software on a production server.

One workaround would be to create a new oraInventory for each $ORACLE_HOME; then you would have to juggle multiple copies of /etc/oraInst.loc (on UNIX) to do maintenance tasks such as patching and upgrades. This would involve some local documentation on your part to keep it all straight. It would allow you to just manually remove old $ORACLE_HOME(s) along with their associated inventory when they are no longer needed, which matters because database releases are all now required to have their own $ORACLE_HOME. MORE WORK!

How To Deinstall/Uninstall Oracle Home In 11gR2 [ID 883743.1]

“De-installation from new OUI is desupported.

Caution:
When you run the deinstall command, if the central inventory (oraInventory) contains no
other registered homes besides the home that you are deconfiguring and removing, then the deinstall command removes the following files and directory contents in the Oracle base directory of the Oracle Database installation owner:”

(In other words, if this is the last ORACLE_HOME registered in the inventory, deinstall will also remove the following directories under ORACLE_BASE. So using one oraInventory per ORACLE_HOME together with the deinstall script may be counterproductive. MORE WORK!)

admin
cfgtoollogs
checkpoints
diag
oradata
flash_recovery_area

1) External de-install utility downloadable from OTN ***Recommended method***

It is advised to use the external De-install utility that is downloadable from OTN as currently there are some open bugs with the deinstall script.”
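Given that caution, it seems worth checking how many homes are still registered before letting deinstall loose. The sketch below assumes the usual central inventory layout, ContentsXML/inventory.xml under the oraInventory directory, with one HOME element per home and REMOVED="T" on detached ones. The function names and the sample XML (including the home names) are made up for illustration.

```python
import xml.etree.ElementTree as ET


def registered_homes(inventory_xml_text):
    """Return the LOC of every home still registered (not flagged REMOVED="T")."""
    root = ET.fromstring(inventory_xml_text.strip())
    return [
        home.get("LOC")
        for home in root.iter("HOME")
        if home.get("REMOVED") != "T"
    ]


def is_last_home(inventory_xml_text, oracle_home):
    """True if oracle_home is the only home left in the inventory."""
    return registered_homes(inventory_xml_text) == [oracle_home]


# Hypothetical inventory content: dbhome_1 already detached, dbhome_2 active.
SAMPLE = """
<INVENTORY>
  <HOME_LIST>
    <HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11.2.0/dbhome_1" TYPE="O" IDX="1" REMOVED="T"/>
    <HOME NAME="OraDb11g_home2" LOC="/u01/app/oracle/product/11.2.0/dbhome_2" TYPE="O" IDX="2"/>
  </HOME_LIST>
</INVENTORY>
"""

if __name__ == "__main__":
    home = "/u01/app/oracle/product/11.2.0/dbhome_2"
    if is_last_home(SAMPLE, home):
        print("Last registered home: deinstall may also clean up ORACLE_BASE")
```

If the check says it is the last home, that is the moment to stop and think about admin, diag and oradata before answering y.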

I downloaded the zip file, checked out the readme. It points you to the documentation on the standard use of deinstall. See the following output for how it was run:

>perl deinstall
Tool is being run outside the Oracle Home, -home needs to be set.
deinstall -home <Complete path of Oracle home>
 [ -silent ]
 [ -checkonly ]
 [ -local ]
 [ -paramfile <complete path of input parameter properties file> ]
 [ -params <name1=value[ name2=value name3=value ...]> ]
 [ -o <complete path of directory for saving files> ]
 [ -tmpdir <complete path of temporary directory to use> ]
 [ -help : Type -help to get more information on each of the above options. 
perl deinstall -home ORACLE_HOME_TO_BE_REMOVED  (simplest form of using this script)
###############CHECK OPERATION SUMMARY #######################
 Oracle Home selected for de-install is: /u01/app/oracle/product/11.2.0/dbhome_1
 Inventory Location where the Oracle home registered is: /u01/app/oraInventory
 Skipping Windows and .NET products configuration check
 Following Single Instance listener(s) will be de-configured: LISTENER1
 No Enterprise Manager configuration to be updated for any database(s)
 No Enterprise Manager ASM targets to update
 No Enterprise Manager listener targets to migrate
 Checking the config status for CCR
 Oracle Home exists with CCR directory, but CCR is not configured
 CCR check is finished
 Do you want to continue (y - yes, n - no)? [n]: y
########### CLEAN OPERATION START ########################
Enterprise Manager Configuration Assistant START
EMCA de-configuration trace file location: /u01/app/oraInventory/logs/emcadc_clean2012-05-08_11-03-05-AM.log
Updating Enterprise Manager ASM targets (if any)
 Updating Enterprise Manager listener targets (if any)
 Enterprise Manager Configuration Assistant END
 Database de-configuration trace file location: /u01/app/oraInventory/logs/databasedc_clean2012-05-08_11-03-51-AM.log
Network Configuration clean config START
Network de-configuration trace file location: /u01/app/oraInventory/logs/netdc_clean2012-05-08_11-03-51-AM.log
De-configuring Single Instance listener(s): LISTENER1
De-configuring listener: LISTENER1
 Stopping listener: LISTENER1
 Listener stopped successfully.
 Deleting listener: LISTENER1
 Listener deleted successfully.
 Listener de-configured successfully.
De-configuring backup files...
 Backup files de-configured successfully.
The network configuration has been cleaned up successfully.
Network Configuration clean config END
Oracle Configuration Manager clean START
 OCM clean log file location : /u01/app/oraInventory/logs//ocm_clean_2012-05-08_11-03-05-AM.log
 Oracle Configuration Manager clean END
 Removing Windows and .NET products configuration END
 Oracle Universal Installer clean START
Detach Oracle home '/u01/app/oracle/product/11.2.0/dbhome_1' from the central inventory on the local node : Done
Delete directory '/u01/app/oracle/product/11.2.0/dbhome_1' on the local node : Done
The Oracle Base directory '/u01/app/oracle' will not be removed on local node. The directory is in use by Oracle Home '/u01/app/oracle/agent12c/core/12.1.0.1.0'.
Oracle Universal Installer cleanup was successful.
Oracle Universal Installer clean END
Oracle install clean START
Clean install operation removing temporary directory '/tmp/deinstall2012-05-08_10-58-01AM' on node 'nodename'
Oracle install clean END
############ CLEAN OPERATION END #########################
########## CLEAN OPERATION SUMMARY #######################
 Following Single Instance listener(s) were de-configured successfully: LISTENER1
 Cleaning the config for CCR
 As CCR is not configured, so skipping the cleaning of CCR configuration
 CCR clean is finished
 Skipping Windows and .NET products configuration clean
 Successfully detached Oracle home '/u01/app/oracle/product/11.2.0/dbhome_1' from the central inventory on the local node.
 Successfully deleted directory '/u01/app/oracle/product/11.2.0/dbhome_1' on the local node.
 Oracle Universal Installer cleanup was successful.
Oracle deinstall tool successfully cleaned up temporary directories.
 #######################################################################


Ok…I kinda left this post in nowhere land. What does a person do with the proliferation of ORACLE_HOMEs? Detaching one from the oraInventory will allow you to safely remove its binary files by hand and reclaim space, as you need some more to install the next version! It is an easy, scriptable and reasonably safe task.

Deinstall by Detaching ORACLE_HOME

 

One of the easiest ways to remove an ORACLE_HOME that is no longer needed is to just detach it from the OraInventory. This is less disruptive and faster than running the Oracle-provided deinstall tool – some personal experiences/observations related to using this utility are mentioned after the code in this section. See the following MOS Document:  How To De-install Oracle Home Using runInstaller [ID 1070610.1]

 

./runInstaller -silent -detachHome -invPtrLoc /etc/oraInst.loc ORACLE_HOME="/u01/app/oracle/product/11.2.0/dbhome_2"

 

oracle@nodename:/u01/app/oracle/product/11.2.0/dbhome_3/oui/bin[SID]

> u01/app/oracle/product/11.2.0/dbhome_3″                                    <

Starting Oracle Universal Installer…

 

Checking swap space: must be greater than 500 MB.   Actual 35913 MB    Passed

The inventory pointer is located at /etc/oraInst.loc

The inventory is located at /u01/app/oraInventory

‘DetachHome’ was successful.

oracle@nodename:/u01/app/oracle/product/11.2.0/dbhome_3/oui/bin[SID]
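Since I called this scriptable, here is a small sketch that only builds the detach command so it can be reviewed before anyone runs it. It uses just the flags shown above; the function name detach_home_command is made up, and nothing is executed.

```python
import shlex


def detach_home_command(oracle_home, inv_ptr_loc="/etc/oraInst.loc"):
    """Build the runInstaller argument list used to detach one ORACLE_HOME."""
    return [
        "./runInstaller",
        "-silent",
        "-detachHome",
        "-invPtrLoc", inv_ptr_loc,
        f"ORACLE_HOME={oracle_home}",
    ]


if __name__ == "__main__":
    args = detach_home_command("/u01/app/oracle/product/11.2.0/dbhome_2")
    # Print for review instead of executing; the command is meant to be
    # run from the oui/bin directory of an Oracle home, as in the session above.
    print(shlex.join(args))
```

Passing the list to subprocess.run from the oui/bin directory would be the obvious next step, once you trust the output.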

 

 

The following example is removing a client install:

./runInstaller -silent -debug -force \
FROM_LOCATION=/u03/jobsuser/patches/client/stage/products.xml \
UNIX_GROUP_NAME=jobsuser \
ORACLE_HOME=/u03/jobsuser/product/11.2/client_2 \
ORACLE_HOME_NAME="OraClient11g_Home2" \
ORACLE_BASE=/u03/jobsuser \
oracle.install.client.installType="Administrator"

 

For more information see the MOS Document: Master Note For Cloning Oracle Database Server ORACLE_HOME’s Using the Oracle Universal Installer (OUI) [ID 1154613.1] Another document outlining the changes for Online Patching: RDBMS Online Patching Aka Hot Patching [ID 761111.1]

 


Discounts on Packt Publishing Oracle Books

Calling all High Rollers: Hit the Oracle PacktPot!

Packt Publishing prepares for COLLABORATE 12 with discounts on Oracle titles all month

As a leading publisher of Oracle books/eBooks, Packt Publishing is inviting their readers to celebrate the upcoming COLLABORATE 12 conference being held in Las Vegas from 22nd-26th April.

 

The COLLABORATE 12 conference covers a wide array of topics across all Oracle technologies and applications, including everything from OBIEE and Siebel, to Hyperion and E-Business Suite R12. Over 6,000 attendees are expected in Las Vegas this April for an exciting week of sessions and networking, and Packt’s offer will help any Oracle professionals make the most of the conference, whether they’re attending or just taking an interest from afar.

Packt is offering a range of exciting discounts on their 60+ titles across all areas of Oracle technology including Applications, Database and Fusion Middleware:

  • 20% off all Oracle print books
  • 30% off all Oracle eBooks
  • 10% off Oracle PacktLib subscriptions

To lend a helping hand to those in attendance or professionals who are simply excited about the buzz that the glamorous Las Vegas event will bring, Packt’s discounts apply to all of their Oracle titles and formats.

For further information on Packt’s “Oracle PacktPot” offers in March, visit

http://www.packtpub.com/news/hit-the-oracle-packtpot

Recent Oracle publications include:

Upcoming titles due in 2012:

  • Oracle E-Business Suite Financials R12: A Functionality Guide
  • Oracle Database 11g: Data Warehousing and Business Intelligence Solutions Cookbook
  • Oracle BAM 11gR1 Handbook

2011 in review

The WordPress.com stats helper monkeys prepared a 2011 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 46,000 times in 2011. If it were a concert at Sydney Opera House, it would take about 17 sold-out performances for that many people to see it.


Disabling ORACLE Reports to plug SQL Injection Attacks. Don’t do this if you still need Oracle Reports to WORK!

Posting this because I had a hard time disabling all of the reports functionality in Fusion Middleware Server…this will also work for older versions of Oracle Application Server.
Just wanted to alert people to the fact that you may have a major security hole with Oracle Reports Server.
We don’t use it at our site and it is my understanding that it is subject to SQL injection attacks.
First off I would check that Oracle Reports is not available outside your firewall or VPN access.
Once I was made aware of the possible security issue, the next step was figuring out how to disable it.
What I found was this MOS article on disabling the help menu.
How to Disable the Oracle Reports Servlet HELP Command URL? [ID 465454.1]
So…I did it quick and dirty by modifying httpd.conf, adding the code below on all of my application servers (FMW, OAS 10g, etc.) and restarting all of the services. It doesn’t seem to take effect if you only restart OHS. The only thing I did differently from the article was to take out the word help, so it denies access to EVERYTHING under reports/rwservlet.
<Location /reports/rwservlet/*>
Order deny,allow
Deny from all
</Location>
NOTE: THIS WILL DISABLE ORACLE REPORTS COMPLETELY. Don’t do this if you still want Oracle Reports functionality. Contact your Oracle support team for their best practice on how to make it secure.
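After restarting the services it is worth confirming that the block really took effect. A rough sketch: request an rwservlet URL and expect 403 Forbidden, which is what a Deny from all rule should return. The function names are made up, and the help path is only a convenient URL to probe.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET on url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx answers; the code is what we want.
        return err.code


def reports_blocked(base_url, fetch=http_status):
    """True if the rwservlet path answers 403 Forbidden."""
    url = base_url.rstrip("/") + "/reports/rwservlet/help"
    return fetch(url) == 403
```

Call reports_blocked with the base URL of each application server, for example from a monitoring job, and worry about any server that returns False.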
See:  Oracle Doc ID 856135.1 How to Deregister Standalone Reports Server 11g From OPMN And Oracle


How to Install FMW 11.1.1.4.0 Standalone Forms/Reports, AKA How to Get to Never Never Land

Hmmm….seems Fusion MW is Peter Panish – a child that doesn’t want to grow up.

Our team is currently using in production the 11.1.1.2.0 FMW (RH 5) install using the Standalone Forms and Reports 32-bit version. Having encountered memory leaks with this version we are now attempting to migrate to 11.1.1.4 64-bit (RH 5).

The documentation to upgrade from 11.1.1.2 to 11.1.1.4 is full of links that take you from one place to another – I am flying over these docs looking for landmarks to get me to where I want to go – Never Never Land. I expect to have lots of fun with little to no consequences when I get there. But alas the following document is giving me fits! Those nasty pirates have been here first.

11.1.1.5.0 seems to be the only ending point (at this writing there doesn’t appear to be a standalone Forms/Reports install available for 11.1.1.5.0)…what if you need to stop at a different version along the way? AAGH!

http://download.oracle.com/docs/cd/E23104_01/download_readme_ps3/download_readme_ps3.htm#BABDBHCJ

Following the above documentation, I assume the way to get to Never Never Land is to start with WebLogic 10.3.2, install FMW 11.1.1.2.0, verify the installation, upgrade WebLogic to 10.3.4, then install the 11.1.1.4.0 patchset. But these steps lead me elsewhere, as shown by the following error on starting the WebLogic Node Manager after four successful (no errors reported) product installations.

WARNING: Uncaught exception in server handler javax.net.ssl.SSLHandshakeException: FATAL Alert:HANDSHAKE_FAILURE – The handshake handler was unable to negotiate an acceptable set of security parameters.
javax.net.ssl.SSLHandshakeException: FATAL Alert:HANDSHAKE_FAILURE – The handshake handler was unable to negotiate an acceptable set of security parameters.
at com.certicom.tls.interfaceimpl.TLSConnectionImpl.fireException(Unknown Source)
at com.certicom.tls.interfaceimpl.TLSConnectionImpl.fireAlertSent(Unknown Source)
at com.certicom.tls.record.handshake.HandshakeHandler.fireAlert(Unknown Source)
at com.certicom.tls.record.handshake.HandshakeHandler.fireAlert(Unknown Source)
at com.certicom.tls.record.handshake.HandshakeHandler.handleHandshakeMessage(Unknown Source)
at com.certicom.tls.record.handshake.HandshakeHandler.handleVersion2HandshakeMessages(Unknown Source)
at com.certicom.tls.record.MessageInterpreter.interpretContent(Unknown Source)
at com.certicom.tls.record.MessageInterpreter.decryptMessage(Unknown Source)
at com.certicom.tls.record.ReadHandler.processRecord(Unknown Source)
at com.certicom.tls.record.ReadHandler.readRecord(Unknown Source)

Time to talk to the

Go to the Fairies that live in Pixie Hollow to ask for help (AKA entering a Service Request on MOS).

Tinkerbell comes back with a revised install for the 64-bit Version of 11.1.1.4.0 Standalone Forms/Reports:

Please download the following from http://edelivery.oracle.com

  • Oracle WebLogic Server 11gR1 (10.3.4)
  • Oracle Portal, Forms, Reports and Discoverer 11g (11.1.1.2.0) (4 parts)
  • Oracle Portal, Forms, Reports and Discoverer 11g Patch Set 3 (11.1.1.4.0)

Order of installations

  • 1. Install 64 bit JDK – jdk-6u21-linux-x64-rpm.bin (at this writing)
  • 2. Install WLS 10.3.4
  • 3. Install but do not configure FMW 11.1.1.2
  • 4. Install FMW 11.1.1.4
  • 5. execute config.sh from the OracleHome/bin folder

I ask Tinkerbell why her list is different from the documentation in MOS Note: How to Install Fusion Middleware 11g Forms and Reports Only (Note:854117.1).

She brightly sighs, shrugs her shoulders and winks at me!

================================

JDK Install

================================

Since we are using a 64-bit OS, we need the Sun JDK 6 64-bit.

The jdk-6u21-linux-x64-rpm.bin can be downloaded from http://java.sun.com/javase/downloads/widget/jdk6.jsp

Verify the correct version of java is installed.

which java

/usr/java/jdk1.6.0_21/bin/java

java -version

In my environment there was a symbolic link for an older version of java 1.4 in /usr/bin/java which I removed. This may not be correct for your system, talk to your system administrator about the best way to install a JDK version for system-wide use.
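The version check above can be scripted so that a stale JDK on the PATH is caught before the installers run. This is only a sketch: the function names are made up for illustration, and it assumes the first line of the `java -version` output has the form `java version "1.6.0_21"`, as the Sun JDK 6 prints it.

```shell
# Extract the quoted version string from `java -version` style output on stdin.
# Only the first matching line is used.
java_version_from_output() {
    sed -n 's/^java version "\(.*\)"$/\1/p' | head -n 1
}

# Succeed only if the version read from stdin equals the expected one ($1).
is_expected_java() {
    [ "$(java_version_from_output)" = "$1" ]
}

# Usage (java -version writes to stderr, hence the 2>&1):
#   java -version 2>&1 | is_expected_java 1.6.0_21 && echo "JDK OK"
```

If the check fails, `which java` shows which binary is being picked up first, which is how the old 1.4 symbolic link showed up here.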

You may also run out of OS-level file descriptors if the limits are left at their defaults. On Linux I edited /etc/security/limits.conf, adding the following entries:

* soft nofile 4096

* hard nofile 4096

# End of file

Again, see your System Administrator, especially if you want to limit the number of open files by user. The example above gives those limits to any logon.
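To confirm that the limits.conf change took effect, the limits can be compared against the 4096 used above after logging in again. A minimal sketch, with a hypothetical helper name; the usage line assumes bash, where `ulimit -Sn` and `ulimit -Hn` report the soft and hard limits:

```shell
# Succeed if the given limit ($1) is "unlimited" or at least the required number ($2).
nofile_ok() {
    limit=$1
    required=$2
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$required" ]
}

# Usage (run as the account that starts the middleware, in a fresh login):
#   nofile_ok "$(ulimit -Sn)" 4096 && nofile_ok "$(ulimit -Hn)" 4096 && echo "limits OK"
```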

================================

Install WebLogic 10.3.4

================================

This is a typical installation, change the directory if desired.

Install WebLogic. Unzip the file and execute

java -Xmx1024m -jar wls1034_generic.jar

Welcome Screen

Choose Middleware Home Directory

Create a new Middleware Home – /aux/oracle/middleware

Register for Security Updates

Hit Next to bypass (this is not a mandatory step)

Choose Install Type

Select Typical

JDK Selection

Select the 1.6.0_21 version (probably already selected)

Choose Product Installation Directories

WebLogic Server – /aux/oracle/middleware/wlserver_10.3

Installation Summary

================================

Install Forms Reports 11.1.1.2

================================

The important thing is to install the software only at this time.

Install Forms Reports. Unzip the files and execute

cd Disk1

./runInstaller

Specify Inventory Directory screen

Directory: /aux/oracle/oraInventory

Group name: dba (If prompted, run the createCentralInventory.sh as directed.)

Welcome Screen

Select Installation Type

Select Install ONLY! You will configure later using the config.sh script. This method installs all of the components, and the configuration step then lets you choose which ones to configure.

Prerequisite Checks

If you receive an error like “Checking for openmotif-2.2.3; not found; failed”, install the missing rpm.

Specify Security Updates

================================

Upgrade Forms Reports 11.1.1.4

================================

Install Forms Reports patchset. Unzip the appropriate file and execute

cd Disk1

./runInstaller

================================

Configure Forms Reports 11.1.1.4

================================

Execute config.sh from the $ORACLE_HOME/bin folder. This is a configuration tool that can be run as a GUI or from the command line.

Other documents that were helpful to our team:

  • Upgrading Oracle Middleware 11g; How to Check that the Core Components are Running Successfully? [ID 1086348.1]
  • Location Of Different Forms Configuration Files in Fusion Middleware Forms and Reports 11.1.1.1, 11.1.1.2 and 11.1.1.3 Installations [ID 854124.1]
  • Maintain FMW [ID 1073776.1]

Disable IPV6 for WebCache

We disabled IPV6 because webcache wouldn’t start….see the instructions below.

14.5.3 Disabling IPv6 Support for Oracle Web Cache

By default, IPv6 support is enabled for Oracle Web Cache. You can disable it in the webcache.xml file, which is located in the following directory:

(UNIX) ORACLE_INSTANCE/config/WebCache/webcache_name
(Windows) ORACLE_INSTANCE\config\WebCache\webcache_name

In the file, change the value of the IPV6 element to "NO". For example:

<IPV6 ENABLED="NO"/>

change LD_LIBRARY

If the IPV6 element does not exist in the webcache.xml file, you can add the element to the file. Add it after the MULTIPORT element, as shown in the following example:

<MULTIPORT>
<LISTEN IPADDR="ANY" PORT="7786" PORTTYPE="ADMINISTRATION"/>
<LISTEN IPADDR="ANY" PORT="7788" PORTTYPE="INVALIDATION"/>
<LISTEN IPADDR="ANY" PORT="7787" PORTTYPE="STATISTICS"/>
</MULTIPORT>
<IPV6 ENABLED="NO"/>
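Before restarting Web Cache, it can help to confirm that the edit actually landed in the file. The following is a small sketch with a hypothetical function name. It only looks for the IPV6 element written with plain ASCII double quotes, so an element pasted from a web page with typographic quotes will not match, and Web Cache is unlikely to accept it either.

```shell
# Succeed if the given webcache.xml ($1) contains <IPV6 ENABLED="NO"/>.
ipv6_disabled() {
    grep -q '<IPV6[[:space:]]\{1,\}ENABLED="NO"[[:space:]]*/>' "$1"
}

# Usage (ORACLE_INSTANCE and webcache_name as in the directory shown above):
#   ipv6_disabled "$ORACLE_INSTANCE/config/WebCache/webcache_name/webcache.xml" \
#       && echo "IPv6 is disabled for Web Cache"
```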


100% CPU Utilization, Corrupted Password Files on FMW 11.1.1.2

We are experiencing memory leaks and high CPU utilization on Fusion Middleware (FMW) Version 11.1.1.2 that gradually build until the server becomes completely unresponsive. The last time this happened, rebooting the server did not bring up all of the FMW components.

Errors in the Browser trying to access Oracle Forms:

Failure of server APACHE bridge:
No backend server available for connection: timed out after 10 seconds or idempotent set to OFF.

Errors in the AdminServer.log for Weblogic:

####<May 3, 2011 12:26:38 PM MDT> <Warning> <DeploymentService> <AdminServer> <[ACTIVE] ExecuteThread: ‘0’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<anonymous>> <> <> <BEA-290014> <Invalid user name or password.>

####<May 3, 2011 12:26:38 PM MDT> <Warning> <DeploymentService> <> <AdminServer> <[ACTIVE] ExecuteThread: ‘0’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<anonymous>> <> <> <> <BEA-290014> <Invalid user name or password.>

####<May 3, 2011 12:26:42 PM MDT> <Error> <Configuration Management> <> <AdminServer> <[ACTIVE] ExecuteThread: ‘0’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<WLS Kernel>> <> <> <> <BEA-150035> <An attempt was made to download the configuration for the server WLS_REPORTS by the user with an invalid password.>

####<May 3, 2011 12:26:44 PM MDT> <Error> <Configuration Management> <> <AdminServer> <[ACTIVE] ExecuteThread: ‘0’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<WLS Kernel>> <> <> <> <BEA-150035> <An attempt was made to download the configuration for the server WLS_FORMS by the user with an invalid password.>

####<Apr 1, 2011 10:58:46 AM MDT> <Notice> <Security> <> <AdminServer> <[ACTIVE] ExecuteThread: ‘4’ for queue: ‘weblogic.kernel.Default (self-tuning)’> <<anonymous>> <> <> <> <BEA-090078> <User  in security realm myrealm has had 5 invalid login attempts, locking account for 30 minutes.>
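When a log is full of entries like these, a quick tally of the BEA message IDs shows which failures dominate. This is a sketch only: the function name is made up, and it assumes the IDs always appear as `BEA-` followed by six digits, as they do in the lines above.

```shell
# Print "ID count" for every BEA message ID in the log ($1), most frequent first.
bea_counts() {
    grep -o 'BEA-[0-9]\{6\}' "$1" | sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

# Usage:
#   bea_counts AdminServer.log
```

In a log like this one, a high count for BEA-290014 or BEA-150035 points at credentials being presented by the managed servers rather than at an end user mistyping a password.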

Searching on MOS came up with two different articles:

Getting Error “User orcladmin In Security Realm Myrealm Has Had 5 Invalid Login Attempts, Locking Account For 30 Minutes.” [ID 1270253.1]

Oracle Middleware 11g – Troubleshooting the Error “Failure of server APACHE bridge” [ID 1304095.1]

Neither MOS document helped, because the user name was blank in the logs. What user? Notice the extra space after “User” in this message: “User  in security realm myrealm”. This is a standalone Forms/Reports install of FMW, so there are very few users.

I changed the password for the WEBLOGIC admin account using the GUI Console, which didn’t help. Aha! We had created boot.properties files for starting all services, and somehow those were no longer working (corrupt?). I recreated them and restarted all of the services successfully.

For a more permanent fix, we are upgrading to FMW 11.1.1.4 on 64-bit RH 5 in a couple of weeks. I don’t recommend using a 32-bit OS for FMW.
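The contents of the recreated files are not shown here. As a rough sketch, assuming the usual WebLogic layout in which each server reads DOMAIN_HOME/servers/SERVER_NAME/security/boot.properties with a username and a password entry (WebLogic encrypts the plain-text values the next time the server starts), recreating one might look like this. The function name and the domain path in the usage lines are hypothetical.

```shell
# Write a plain-text boot.properties for one server.
# Arguments: domain_dir server_name username password
write_boot_properties() {
    dir="$1/servers/$2/security"
    mkdir -p "$dir" || return 1
    # Subshell so the restrictive umask applies only to this file.
    ( umask 077
      printf 'username=%s\npassword=%s\n' "$3" "$4" > "$dir/boot.properties" )
}

# Usage (hypothetical domain path; server names taken from the log above):
#   for s in AdminServer WLS_FORMS WLS_REPORTS; do
#       write_boot_properties /path/to/domain "$s" weblogic 'the-new-password'
#   done
```

The sketch writes the values as given and does no escaping, so a password containing a backslash would need the escaping that Java properties files expect.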
