Zero Downtime, REST, Domain Partitions / Multi Tenancy, Elasticity and WLDF. WebLogic 12.2.1 (12c)

I just finished a two week long hands-on consulting session for some pretty experienced application managers and architects at one of Europe’s premier financial services institution.

In 5 days we explored WebLogic 12.2.1 extensively:

  • Zero Downtime
  • REST
  • Domain Partitions / Multi Tenancy
  • Resource Group Management
  • Java Mission Control
  • WLST
  • Elasticity
  • JMS Clustering
  • WLDF



Here is some feedback from the group. You can tell we had fun, although we worked very hard.

Screen Shot 2016-07-04 at 10.43.06


This is how a happy group looks like.



People seemed to be happy, here is what they liked.

Screen Shot 2016-07-04 at 10.44.13

For more details download the flyer from the Oracle WebLogic Server 12.2.1 (12c) course site.

Cloud Survey: Computerworld Quotes

Update as of Nov 2014: You can get the whole survey now from here (registration required though).


Cloud is more popular than ever. Even the slower, more traditional software vendors now pick up the concepts I wrote about back in 2011. Keep going!

Find attached some quotes from the Computerworld magazine.

computerworld cloud survey

RESTful WebLogic Monitoring of Servers, Applications, JDBC and JMS with Jolokia

[This posting will be a part of my upcoming WebLogic 12c book]

This article is part II of III. Make sure you read part I explaining the basics of monitoring WebLogic with WLST, the WebLogic 12c REST-ful management API and the open source framework Jolokia.

There will be a part III in a few of days, explaining how to use j4psh with WebLogic. j4psh is a JMX shell that resembles WLST in interactive mode but includes features such as syntax highlighting and tab-completion of commands and MBean names. j4psh is highly useful to interactively find out the correct MBean names used for the requests below.

WebLogic with Jolokia

Did you ever wonder how to retrieve the configuration values for a WebLogic managed server, runtime data of a JDBC connection pool or the number of messages in a particular JMS queue with a fast and simple HTTP GET?

The examples show typical use cases for monitoring a WebLogic domain with the following configuration:

– Admin server running at localhost:7001
– Managed server with name surf1 running at localhost:7003
– JDBC datasource with the name emergencyDB and associated connection pool (target set to surf1)
– JMS configuration with a JMS server surfJMS running on managed server surf1, a module surfJMSModule containing a queue name jms/ShippingRequestQueue (target of JMS server is surf1, JMS module subdeployment set to surfJMS)
– A deployment of a web service with the name SubmitOrder and target AdminServer and surf1.

An installation of Jolokia with target set to all server of the domain is recommended to follow the examples in this part.


General: List Configuration Details of Managed Server with Name surf1


Note, that without the ?ignoreErrors=true parameter the request would fail with an “NoAccessRuntimeException” because you cannot access the SSL keystore passphrases without proper authentication. Authentication is easily possible, but beyond of the scope of this article.

{ "request" : { "mbean" : "com.bea:Name=surf1,Type=Server",
      "type" : "read"
  "status" : 200,
  "timestamp" : 1334778534,
  "value" : { "AcceptBacklog" : 300,
      "AddWorkManagerThreadsByCpuCount" : false,
      "AdminReconnectIntervalSeconds" : 10,
      "AdministrationPort" : 9002,
      "AdministrationPortEnabled" : false,
      "AdministrationProtocol" : "t3s",
      "AutoKillIfFailed" : false,
      "AutoMigrationEnabled" : false,
      "AutoRestart" : true,
      "COM" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=COM" },
      "COMEnabled" : false,
      "CandidateMachines" : [  ],
      "ClasspathServletDisabled" : false,
      "ClientCertProxyEnabled" : false,
      "Cluster" : { "objectName" : "com.bea:Name=surfCluster,Type=Cluster" },
      "ClusterRuntime" : null,
      "ClusterWeight" : 100,
      "CoherenceClusterSystemResource" : null,
      "CompleteCOMMessageTimeout" : -1,
      "CompleteHTTPMessageTimeout" : -1,
      "CompleteIIOPMessageTimeout" : -1,
      "CompleteMessageTimeout" : 60,
      "CompleteT3MessageTimeout" : -1,
      "ConnectTimeout" : 0,
      "ConsensusProcessIdentifier" : -1,
      "ConsoleInputEnabled" : false,
      "CustomIdentityKeyStoreFileName" : null,
      "CustomIdentityKeyStorePassPhrase" : "ERROR: Access to sensitive attribute in clear text is not allowed due to the setting of ClearTextCredentialAccessEnabled attribute in SecurityConfigurationMBean. Attr: CustomIdentityKeyStorePassPhrase, MBean name: com.bea:Name=surf1,Type=Server (class",
      "CustomIdentityKeyStorePassPhraseEncrypted" : "ERROR: Access not allowed for subject: principals=[], on ResourceType: Server Action: read, Target: CustomIdentityKeyStorePassPhraseEncrypted (class",
      "CustomIdentityKeyStoreType" : null,
      "CustomTrustKeyStoreFileName" : null,
      "CustomTrustKeyStorePassPhrase" : "ERROR: Access to sensitive attribute in clear text is not allowed due to the setting of ClearTextCredentialAccessEnabled attribute in SecurityConfigurationMBean. Attr: CustomTrustKeyStorePassPhrase, MBean name: com.bea:Name=surf1,Type=Server (class",
      "CustomTrustKeyStorePassPhraseEncrypted" : "ERROR: Access not allowed for subject: principals=[], on ResourceType: Server Action: read, Target: CustomTrustKeyStorePassPhraseEncrypted (class",
      "CustomTrustKeyStoreType" : null,
      "DGCIdlePeriodsUntilTimeout" : 5,
      "DataSource" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=DataSource" },
      "DefaultFileStore" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=DefaultFileStore" },
      "DefaultIIOPPassword" : "ERROR: Access to sensitive attribute in clear text is not allowed due to the setting of ClearTextCredentialAccessEnabled attribute in SecurityConfigurationMBean. Attr: DefaultIIOPPassword, MBean name: com.bea:Name=surf1,Type=Server (class",
      "DefaultIIOPPasswordEncrypted" : "ERROR: Access not allowed for subject: principals=[], on ResourceType: Server Action: read, Target: DefaultIIOPPasswordEncrypted (class",
      "DefaultIIOPUser" : null,
      "DefaultInternalServletsDisabled" : false,
      "DefaultProtocol" : "t3",
      "DefaultSecureProtocol" : "t3s",
      "DefaultTGIOPPassword" : "ERROR: Access to sensitive attribute in clear text is not allowed due to the setting of ClearTextCredentialAccessEnabled attribute in SecurityConfigurationMBean. Attr: DefaultTGIOPPassword, MBean name: com.bea:Name=surf1,Type=Server (class",
      "DefaultTGIOPPasswordEncrypted" : "ERROR: Access not allowed for subject: principals=[], on ResourceType: Server Action: read, Target: DefaultTGIOPPasswordEncrypted (class",
      "DefaultTGIOPUser" : "guest",
      "ExecuteQueues" : [  ],
      "ExpectedToRun" : true,
      "ExternalDNSName" : null,
      "ExtraEjbcOptions" : null,
      "ExtraRmicOptions" : null,
      "FederationServices" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=FederationServices" },
      "GatheredWritesEnabled" : false,
      "GracefulShutdownTimeout" : 0,
      "HealthCheckIntervalSeconds" : 180,
      "HealthCheckStartDelaySeconds" : 120,
      "HealthCheckTimeoutSeconds" : 60,
      "HostsMigratableServices" : true,
      "HttpTraceSupportEnabled" : false,
      "HttpdEnabled" : true,
      "IIOP" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=IIOP" },
      "IIOPEnabled" : true,
      "IIOPTxMechanism" : "ots",
      "IdleConnectionTimeout" : 65,
      "IdleIIOPConnectionTimeout" : -1,
      "IdlePeriodsUntilTimeout" : 4,
      "IgnoreSessionsDuringShutdown" : false,
      "InstrumentStackTraceEnabled" : true,
      "InterfaceAddress" : null,
      "JDBCLLRTableName" : null,
      "JDBCLoggingEnabled" : false,
      "JDBCLoginTimeoutSeconds" : 0,
      "JMSDefaultConnectionFactoriesEnabled" : true,
      "JMSThreadPoolSize" : 15,
      "JNDITransportableObjectFactoryList" : [  ],
      "JTAMigratableTarget" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=JTAMigratableTarget" },
      "JavaCompiler" : "javac",
      "JavaCompilerPostClassPath" : null,
      "JavaCompilerPreClassPath" : null,
      "JavaStandardTrustKeyStorePassPhrase" : "ERROR: Access to sensitive attribute in clear text is not allowed due to the setting of ClearTextCredentialAccessEnabled attribute in SecurityConfigurationMBean. Attr: JavaStandardTrustKeyStorePassPhrase, MBean name: com.bea:Name=surf1,Type=Server (class",
      "JavaStandardTrustKeyStorePassPhraseEncrypted" : "ERROR: Access not allowed for subject: principals=[], on ResourceType: Server Action: read, Target: JavaStandardTrustKeyStorePassPhraseEncrypted (class",
      "KernelDebug" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=ServerDebug" },
      "KeyStores" : "DemoIdentityAndDemoTrust",
      "ListenAddress" : "",
      "ListenDelaySecs" : 0,
      "ListenPort" : 7003,
      "ListenPortEnabled" : true,
      "ListenThreadStartDelaySecs" : 60,
      "ListenersBindEarly" : false,
      "Log" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=Log" },
      "LogRemoteExceptionsEnabled" : false,
      "LoginTimeoutMillis" : 5000,
      "LowMemoryGCThreshold" : 5,
      "LowMemoryGranularityLevel" : 5,
      "LowMemorySampleSize" : 10,
      "LowMemoryTimeInterval" : 3600,
      "MSIFileReplicationEnabled" : false,
      "Machine" : null,
      "ManagedServerIndependenceEnabled" : true,
      "MaxCOMMessageSize" : -1,
      "MaxHTTPMessageSize" : -1,
      "MaxIIOPMessageSize" : -1,
      "MaxMessageSize" : 10000000,
      "MaxOpenSockCount" : -1,
      "MaxT3MessageSize" : -1,
      "MessageIdPrefixEnabled" : false,
      "MessagingBridgeThreadPoolSize" : 5,
      "MuxerClass" : null,
      "NMSocketCreateTimeoutInMillis" : 180000,
      "Name" : "surf1",
      "NativeIOEnabled" : true,
      "NetworkAccessPoints" : [  ],
      "Notes" : null,
      "OutboundEnabled" : false,
      "OutboundPrivateKeyEnabled" : false,
      "OverloadProtection" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=OverloadProtection" },
      "Parent" : { "objectName" : "com.bea:Name=surfandconsulting,Type=Domain" },
      "PeriodLength" : 60000,
      "PreferredSecondaryGroup" : null,
      "ReliableDeliveryPolicy" : null,
      "ReplicationGroup" : null,
      "ReplicationPorts" : null,
      "RestartDelaySeconds" : 0,
      "RestartIntervalSeconds" : 3600,
      "RestartMax" : 2,
      "ReverseDNSAllowed" : false,
      "SSL" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=SSL" },
      "ScatteredReadsEnabled" : false,
      "SelfTuningThreadPoolSizeMax" : 400,
      "SelfTuningThreadPoolSizeMin" : 1,
      "ServerDebug" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=ServerDebug" },
      "ServerDiagnosticConfig" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=WLDFServerDiagnostic" },
      "ServerLifeCycleTimeoutVal" : 30,
      "ServerStart" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=ServerStart" },
      "ServerVersion" : "unknown",
      "SingleSignOnServices" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=SingleSignOnServices" },
      "SocketBufferSizeAsChunkSize" : false,
      "SocketReaders" : -1,
      "StagingDirectoryName" : "/u01/domains/surfandconsulting/servers/surf1/stage",
      "StagingMode" : "stage",
      "StartupMode" : "RUNNING",
      "StartupTimeout" : 0,
      "StuckThreadMaxTime" : 600,
      "StuckThreadTimerInterval" : 60,
      "SystemPasswordEncrypted" : "ERROR: Access not allowed for subject: principals=[], on ResourceType: Server Action: read, Target: SystemPasswordEncrypted (class",
      "TGIOPEnabled" : true,
      "ThreadPoolPercentSocketReaders" : 33,
      "TimedOutRefIsolationTime" : 0,
      "TransactionLogFilePrefix" : "./",
      "TransactionLogFileWritePolicy" : "Direct-Write",
      "TransactionLogJDBCStore" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=TransactionLogJDBCStore" },
      "TunnelingClientPingSecs" : 45,
      "TunnelingClientTimeoutSecs" : 40,
      "TunnelingEnabled" : false,
      "Type" : "Server",
      "UploadDirectoryName" : "./servers/surf1/upload",
      "Use81StyleExecuteQueues" : false,
      "UseConcurrentQueueForRequestManager" : false,
      "UseFusionForLLR" : false,
      "VerboseEJBDeploymentEnabled" : "false",
      "VirtualMachineName" : "surfandconsulting_surf1",
      "WebServer" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=WebServer" },
      "WebService" : { "objectName" : "com.bea:Name=surf1,Server=surf1,Type=WebService" },
      "WeblogicPluginEnabled" : false,
      "XMLEntityCache" : null,
      "XMLRegistry" : null

Deployment: List all Deployments

{ "request" : { "mbean" : "com.bea:Type=ApplicationRuntime,*",
 "type" : "search"
 "status" : 200,
 "timestamp" : 1334780578,
 "value" : [ "com.bea:Name=mejb,ServerRuntime=AdminServer,Type=ApplicationRuntime",

Deployment: List Deployments Details for an Application with the Name SubmitOrder

{ "request" : { "mbean" : "com.bea:Name=SubmitOrder,ServerRuntime=*,Type=ApplicationRuntime",
      "type" : "read"
  "status" : 200,
  "timestamp" : 1334777478,
  "value" : { "com.bea:Name=SubmitOrder,ServerRuntime=AdminServer,Type=ApplicationRuntime" : { "ActiveVersionState" : 2,
          "ApplicationName" : "SubmitOrder",
          "ApplicationVersion" : null,
          "ClassRedefinitionRuntime" : null,
          "CoherenceClusterRuntime" : null,
          "ComponentRuntimes" : [ { "objectName" : "com.bea:ApplicationRuntime=SubmitOrder,Name=AdminServer_/SubmitOrder,ServerRuntime=AdminServer,Type=WebAppComponentRuntime" } ],
          "EAR" : false,
          "HealthState" : { "critical" : false,
              "mBeanName" : null,
              "mBeanType" : null,
              "reasonCode" : [  ],
              "state" : 0,
              "subsystemName" : null
          "KodoPersistenceUnitRuntimes" : [  ],
          "LibraryRuntimes" : null,
          "MaxThreadsConstraintRuntimes" : [  ],
          "MinThreadsConstraintRuntimes" : [  ],
          "Name" : "SubmitOrder",
          "OptionalPackageRuntimes" : [  ],
          "Parent" : { "objectName" : "com.bea:Name=AdminServer,Type=ServerRuntime" },
          "PersistenceUnitRuntimes" : [  ],
          "QueryCacheRuntimes" : [  ],
          "RequestClassRuntimes" : [  ],
          "Type" : "ApplicationRuntime",
          "WorkManagerRuntimes" : [ { "objectName" : "com.bea:ApplicationRuntime=SubmitOrder,Name=default,ServerRuntime=AdminServer,Type=WorkManagerRuntime" } ],
          "WseeRuntimes" : [ { "objectName" : "com.bea:ApplicationRuntime=SubmitOrder,Name=SubmitOrder!EntryWSProdService,ServerRuntime=AdminServer,Type=WseeRuntime" },
              { "objectName" : "com.bea:ApplicationRuntime=SubmitOrder,Name=SubmitOrder!EntryWSService,ServerRuntime=AdminServer,Type=WseeRuntime" }
          "WseeV2Runtimes" : [  ]
        } }

JDBC Data Source: Retrieve Settings for emergencyDB Data Source with Target Managed Server surf1 Running at localhost:7003

{ "request" : { "mbean" : "com.bea:Name=emergencyDB,ServerRuntime=surf1,Type=JDBCDataSourceRuntime",
      "type" : "read"
  "status" : 200,
  "timestamp" : 1334823502,
  "value" : { "ActiveConnectionsAverageCount" : 0,
      "ActiveConnectionsCurrentCount" : 0,
      "ActiveConnectionsHighCount" : 1,
      "ConnectionDelayTime" : 44,
      "ConnectionsTotalCount" : 1,
      "CurrCapacity" : 1,
      "CurrCapacityHighCount" : 1,
      "DatabaseProductName" : "Apache Derby",
      "DatabaseProductVersion" : " - (938214)",
      "DeploymentState" : 2,
      "DriverName" : "Apache Derby Network Client JDBC Driver",
      "DriverVersion" : " - (938214)",
      "Enabled" : true,
      "FailedReserveRequestCount" : 0,
      "FailuresToReconnectCount" : 0,
      "HighestNumAvailable" : 1,
      "HighestNumUnavailable" : 1,
      "JDBCDriverRuntime" : { "objectName" : "com.bea:Name=surfandconsulting_surf1_org.apache.derby.jdbc.ClientDriver,ServerRuntime=surf1,Type=JDBCDriverRuntime" },
      "LastTask" : null,
      "LeakedConnectionCount" : 0,
      "ModuleId" : "emergencyDB",
      "Name" : "emergencyDB",
      "NumAvailable" : 1,
      "NumUnavailable" : 0,
      "Parent" : { "objectName" : "com.bea:Name=surf1,Type=ServerRuntime" },
      "PrepStmtCacheAccessCount" : 0,
      "PrepStmtCacheAddCount" : 0,
      "PrepStmtCacheCurrentSize" : 0,
      "PrepStmtCacheDeleteCount" : 0,
      "PrepStmtCacheHitCount" : 0,
      "PrepStmtCacheMissCount" : 0,
      "Properties" : "ERROR: MBean getAttribute failed: weblogic.common.resourcepool.ResourcePermissionsException: User \"\" does not have permission to perform operation \"admin\" on resource \"emergencyDB\" of module \"null\" of application \"null\" of type \"ConnectionPool\" (class",
      "ReserveRequestCount" : 1,
      "State" : "Running",
      "Type" : "JDBCDataSourceRuntime",
      "VersionJDBCDriver" : "org.apache.derby.jdbc.ClientDriver",
      "WaitSecondsHighCount" : 0,
      "WaitingForConnectionCurrentCount" : 0,
      "WaitingForConnectionFailureTotal" : 0,
      "WaitingForConnectionHighCount" : 0,
      "WaitingForConnectionSuccessTotal" : 0,
      "WaitingForConnectionTotal" : 0,
      "WorkManagerRuntimes" : null

JDBC Connection Pool: Find all JMS Servers with Target set to Managed Server Running at localhost:7003


which returns the JMS server surfJMS:

{ "request" : { "mbean" : "com.bea:Name=emergencyDB,ServerRuntime=surf1,Type=JDBCConnectionPoolRuntime",
      "type" : "read"
  "status" : 200,
  "timestamp" : 1334823105,
  "value" : { "ActiveConnectionsAverageCount" : 0,
      "ActiveConnectionsCurrentCount" : 0,
      "ActiveConnectionsHighCount" : 0,
      "ConnectionDelayTime" : 44,
      "ConnectionLeakProfileCount" : 0,
      "ConnectionsTotalCount" : 1,
      "CurrCapacity" : 1,
      "DeploymentState" : 2,
      "Enabled" : true,
      "FailuresToReconnectCount" : 0,
      "HighestNumAvailable" : 1,
      "HighestNumUnavailable" : 0,
      "LeakedConnectionCount" : 0,
      "MaxCapacity" : 15,
      "ModuleId" : "emergencyDB",
      "Name" : "emergencyDB",
      "NumAvailable" : 1,
      "NumUnavailable" : 0,
      "Parent" : { "objectName" : "com.bea:Name=surf1,Type=ServerRuntime" },
      "PoolState" : true,
      "Properties" : "ERROR: MBean getAttribute failed: weblogic.common.resourcepool.ResourcePermissionsException: User \"\" does not have permission to perform operation \"admin\" on resource \"emergencyDB\" of module \"null\" of application \"null\" of type \"ConnectionPool\" (class",
      "State" : "Running",
      "StatementProfileCount" : 0,
      "Type" : "JDBCConnectionPoolRuntime",
      "VersionJDBCDriver" : "org.apache.derby.jdbc.ClientDriver",
      "WaitSecondsHighCount" : 0,
      "WaitingForConnectionCurrentCount" : 0,
      "WaitingForConnectionHighCount" : 0,
      "WorkManagerRuntimes" : null

JMS: Find all JMS Servers with Target set to Managed Server Running at localhost:7003


which returns the JMS server surfJMS:


JMS: Runtime Properties for a Particular Queue

Request the MessagesCurrentCount, MessagesHighCount and MessagesReceivedCount for a queue with the name jms/ShippingRequestQueue (note the “!” is used in the URL to escape the “/” character in the queue name) in a JMS module with the name surfJMSModule (note that JMS module and queue name are typically separated in the MBean Name attribute with a “!” which has to be escaped by another “!”).


which returns the following JSON structure.


The output for the queue monitoring shows that there was a total of 3 messages sent, with a maximum of 1 in the queue (because of a deployed receiver retrieving the messages immediately) and currently 0 message in the queue.


WebLogic Stuck Threads: Creating, Understanding and Dealing with them

Using the time off during the bank holidays over Easter I spent some time coding and looking into more unknown details of WebLogic stuck thread behavior. (Actually I started to write this posting because I was told by my doctor to keep my mouth shut for some days, but that’s another story…).

My personal task was to answer some of the most common questions I’ve encountered while consulting and running WebLogic 12c workshops. As often with my postings, this article is not meant to explain the basic concept of thread pools or workmanagers. I recommend to read the Oracle WebLogic 12c documentation about stuck thread handling first which explains how you can deal with stuck threads by configuring a workmanager.

Also there are some excellent details about stuck threads (including WLST scripting and monitoring) to be found at the Middleware Magic site – a site run by a group of really knowledgeable guys.


Now, typically customers tell me that they “observe some stuck threads”, “sometimes”, but often they are “not sure what caused them” and typically they “don’t know what exact state these thread are in” and in addition nobody seems to know if “the stuck threads ever clear up again without rebooting”. I am a pragmatic guy. I enjoy having little applications or tools to demonstrate and measure how WebLogic is working. Keen to play around with the newest edition of Netbeans (I used to be an Eclipse guy) and EJB 3.1 in WLS12c I built a small application to easily test WebLogic stuck thread settings and countermeasures.

Here are some more details about the StuckThreadForFree application:

  • The application allows you to create threads which are busy or which are waiting long enough to be detected as “stuck” by WebLogic.
  • This little application will only work with WLS12c. I intentionally avoided JSF, so a plain JSP page is used to set your parameters. The JSP is calling a simple Servlet which in a for loop is calling an asynchronous business method of an injected  stateless session bean. @Asynchronous and no-interface session beans are only available in EJB3.1 so you have to run it on WLS12c. Unlike in previous versions, the EJB is directly packaged into the .war file for deployment.
  • Every call to the stateless session bean is serialized by the EJB container, so every EJB method is executing in its own thread.
  • Depending on which method was called on the EJB is either waiting n seconds using Thread.sleep() or calculating some trigonometric function for n seconds. Both methods will cause stuck threads.
  • There is zero configuration in the deployment descriptors for the EJB! Only context-root for the web part is set (which could be avoided as well).
  • Building the StuckThreadForFree app with Netbeans was a smooth ride and a real pleasure.
  • The app is provided as is. You can have it for free, yet there is no guarantee for anything but it shouldn’t cause any problems either. Better don’t run it on your production system.
  • It’s just a hack. It demonstrates what it should, nothing else.

DOWNLOAD: for your conveninience you can download the StuckThreadForFree.war from here and follow the example yourself (here is the link to the whole Netbeans project). After downloading you can easily deploy it to WebLogic. To follow the example it’s good enough to run it the admin server. Then you can start with the following URL:


Now, lets use the app to answer some typical questions.

What are hogging threads? When do threads become hogged? After what period of time?

According to the Oracle doc hogging threads “.. will either be declared as stuck after the configured timeout or will return to the pool before that. The self-tuning mechanism will backfill if necessary.”

So how long does it take for them to become hogged? Nobody (including Google) seemed to know. Trust me I did some research and asked plenty of colleagues about this. Here is the answer:

If you run the application with 3 threads / 100 seconds / Thread.sleep() and immediately switch to the WebLogic 12c admin console Admin Server / Monitoring / Threads you will observe the following:


So interestingly hogging threads are detected right away! In my case it took about 2 seconds (I had to hit reload once).


So WebLogic transitions into FAILED state when a certain number of stuck threads are detected, right? 

That’s a common misconception! The default configuration of WLS 12c (I also checked for WLS 11 = 10.3.3) is Stuck Thread Count = 0, which means the server “never transitions into FAILED server irrespective of the number of stuck threads”. You will only see the FAILED state only when you set the value to a positive number of threads!

Once the server transitions into FAILED, you can define if WLS should be shut down (and restarted by WLS nodemanager) or suspended.


Remember: WLS will not transition into FAILED state when StuckThreadCount is set to zero. Only the health runtime value is set to Warning (but this will be cleared if the hogging thread conditions clears) as shown below:


What exactly causes a stuck thread? What state does a thread have to be in to be marked as stuck?

In general there is a number of different thread states in Java: NEWRUNNABLEBLOCKEDWAITINGTIMED_WAITINGTERMINATED.

But which state has a thread to be in to be marked as stuck later? If you run the StuckThreadForFree application and create a stack trace with WebLogic admin console under Server / ServerName / Monitoring / Threads you can observe that the thread state is ACTIVE/TIMED_WAITING when using the Thread.sleep() method to block it:


"[ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)'" TIMED_WAITING
            	java.lang.Thread.sleep(Native Method)
            	com.munzandmore.stuckthread.LongRunningEJB_x9v26k_NoIntfViewImpl.__WL_invoke(Unknown Source)



when using the calc() method to keep the threads busy they are state ACTIVE/RUNNABLE :

"[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'" RUNNABLE
            	com.munzandmore.stuckthread.LongRunningEJB_x9v26k_NoIntfViewImpl.__WL_invoke(Unknown Source)

So both states can become stuck. Also, I am pretty sure I could also show the BLOCKED state when using a monitor lock for synchronization but due to time restrictions this is not included in the app.


Can a stuck thread still do reasonable work?

Absolutely! Just because a thread is marked as stuck it doesn’t mean it is frozen or unusable. Imagine you wanted to calculate PI, you are creating PDFs, distance maps, mapping the human genome or you have deployed some JCA adapter talking to MQ-Series, SAP or PeopleSoft which is internally using a Thread.sleep() method call. All of this is are reasonable usages likely to occur in the wild.


Do stuck threads ever dissapear? Can they be cleared somehow? Are they stuck forever?

First of all you cannot get rid of a stuck thread by simply “killing it”. You cannot cancel or kill any thread in Java. However, stuck threads automatically will disappear if the condition clears up which caused them to be marked as stuck (e.g. the sleep period is over or the calculation is done).

To prove the point, switch to the WebLogic admin console and under Server / ServerName / Configuration set StuckThreadCount to 3 and StuckThreadTime to 60 seconds then restart the server and run the StuckThreadForFree app to create 3 threads running for 120 seconds using the Thread.sleep() method (the other method will work as well, there is no difference, but keeping 3 threads busy by doing math proves to be a fan test of your machine as well):



In the WebLogic log file you will find three entries logging the stuck thread state after a while:

<05.04.2012 10:55 Uhr MESZ> <Critical> <WebLogicServer> <BEA-000385> <Server health failed. Reason: health of
critical service 'Thread Pool' failed>
<05.04.2012 10:55 Uhr MESZ> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '4' for queue: 'webl
ogic.kernel.Default (self-tuning)' has been busy for "85" seconds working on the request "Workmanager: default
, Version: 1, Scheduled=false, Started=true, Started time: 85443 ms
", which is more than the configured time (StuckThreadMaxTime) of "60" seconds. Stack trace:
 java.lang.Thread.sleep(Native Method)


After waiting about one minute you will observe that WebLogic  is transitioning into FAILED state as configured:


Wait another minute, then check the thread states under Server / ServerName / Monitoring / Threads which reveals the following:


So once the condition causing the stuck threads is cleared also the stuck threads will disappear again! Stuck threads are not stuck forever. Phew!


When should I use StuckThreadCount in the admin console or a Workmanager stuck-thread setting then?

Very good question. Use StuckThreadCount from the WebLogic admin console or with a <work-manager-shutdown-trigger> definition moving the application into ADMIN mode if you can react on the FAILED state.

Do not use StuckThreadCount if the threads might be doing something useful and you cannot react on the situation anyway. Obviously transitioning into FAILED state and restarting WLS with the nodemanager is counterproductive if you threads are doing something useful.




The following posting shows how simple tools like ps, top and jcmd can track down the exact line of Java code causing a thread to use a high amount of CPU. Exactly the same StuckThreadForFree application is used as here.


Even More?

Update: There is great follow-up post now by Mark Otting showing you how to use WLST to monitor stuck threads, including a ready to run WLST script! Actually something I was talking about just a while ago at a customer site. Thanks to Mark you can download it.

Oracle WebLogic JMS Queues or AWS Cloud Simple Queue Service (SQS)

This is a shortened extract of my my book Middleware and Cloud Computing.

AWS Simple Queue Service

Amazon’s Simple Queue Service (SQS) is a cloud service for reliable messaging. The SQS service with its queues is located off-host. So, similar to the elastic load balancing service, or the relational database service, you can use the service without having to start an EC2 instance.


SQS is available in all four AWS regions with the same pricing. All regions are independent of each other so messages can never be in-between regions. Queue names have to be unique per region.

Highly available

Queues are highly available: Messages waiting in queues for their delivery are stored redundantly on multiple servers and in multiple data centers.

queue size

There is no limit for the number of messages or the size of a particular queue. One message body can be up to 64 KB of text in any format (default is 8KB). For larger messages you have to store the message somewhere else reliably, e.g. in S3, SimpleDB or RDS, and pass around a reference to the storage location instead of passing the message itself.

Message expiry

When a message remains in a queue (because there is no receiver removing the message from the queue), the message expires after a default of four days (or a configurable maximum of 14 days).

After receiving a message from a queue, the message is locked for a configurable timeout. While the message is locked it is invisible to other receivers. SQS uses this mechanism to ensure that messages are delivered once.

It’s the receiver’s responsibility to explicitly delete the message when it is processed successfully. If the receiver fails before it is able to delete the message, then the message becomes visible again after the timeout, and another receiver can receive it.

Access to queues is restricted to the AWS account owners, but you can specify in an access policy statement that a queue will be shared.

or encryption

Encryption is not a built-in SQS feature, but depending on your privacy requirements you can consider encrypting the content of your message at an application level. Also, there is no built-in compression feature, but you can compress large messages at an application level before sending them.

At least once

The message delivery semantic is engineered to be “at least once”. This means your applications have to cope with message duplicates.



Access to SQS is purely programmatic. Currently, there are no command-line tools from AWS, and there is no integration for SQS into the AWS management console yet.

There are language bindings for Java, PHP, Perl and C#. Also, the Java Typica library supports SQS.

SQS is ideal for decoupling systems or applications running on EC2. From a design perspective, SQS has many features in common with JMS queues. The most important differences between SQS and JMS queues are listed in Table 1.

Table 1: SQS Comparison with WLS Queues

SQS Queues WebLogic JMS Queues
Max queue size Unlimited Limit depends on JVM heap and persistent store
Best Quality of Service At least once Exactly-once
with transactions
Configurable retries No Yes
Persistence Always Optional
Scalability Inherent With distributed queues
Availability Inherent Whole-server migration 

or JMS service migration

Message Order Not guaranteed Can be enforced even for distributed queues
Configurable quotas No Yes
Configurable flow control No Yes
Auto acknowledge No Yes
Time To Live configuration 1h to 14d 1 ms to ca. 2 mio years
Max message size 64 KB Unlimited,
default is 10,000 KB
Compression No Yes
Billing Free usage tier, then charged per request and data transfer amount Included with WLS


To conclude, SQS is an AWS cloud service that could replace WebLogic JMS queues.

Compared to JMS queues, SQS has fewer features, no auto acknowledgement of messages and no support for exactly-once message delivery. The advantage of SQS over JMS queues is SQS’ inherent availability, the virtually unlimited storage for messages and the zero configuration.

The inherent availability is an especially important factor to consider when deciding between SQS or JMS queues, because the built-in features offered by WebLogic for achieving availability of JMS are restricted in today’s clouds.

SQS is implemented off-instance; therefore, its availability is not affected if a particular EC2 instance becomes unavailable.


Interestingly, there is a cloud service for the counterpart to JMS topics as well. The AWS Simple Notification Service allows you to send messages to more than one receiver using transport protocols such as HTTP, email and even SQS.


In case you are wondering how this relates to Oracle Service Bus: Comparing SQS with Oracle Service Bus is like comparing apples with oranges, because in addition to the built-in JMS, service bus also supports protocol adaption, message flows with content-based routing, and most importantly, it is configuration driven.

In a nutshell: SQS is a queue service for the AWS cloud to decouple systems with message passing. As a cloud service it abstracts the Java EE specific details of JMS – nevertheless SQS is specific to AWS. Currently there is no cloud messaging service offered for the Rackspace cloud. Using an AWS specific service like SQS increases the effort to migrate to another cloud provider (and limits your possibilities to quickly switch to another cloud provider as a part of a contingency plan).


There is a free usage tier for up to 100,000 requests per month. Beyond that, Amazon adds $0.01 per 10,000 SQS requests to your bill.

In addition, you have to pay for the data transfer as shown the figure below. Only data transferred between SQS and EC2 within a single region is free. Data transferred between different regions will be charged at Internet data transfer rates on both ends.



More details on my Middleware and Cloud Computing book.