
EM13c Gold Agent Image


One of the most interesting new features in Enterprise Manager 13c is the Gold Agent Image. This new feature simplifies agent management.

The first thing to do is to create a Gold Agent image. From the Setup menu, select Gold Agent Image:

ga1

Then select Manage All Images:

ga2

Select Create

ga3

Enter the image name, a description and the platform name, then submit:

Your gold image is created:

ga4

Before subscribing an agent, we have to create a version of the gold agent image. In the Manage Images screen, we select Version and Drafts and choose Create:

ga5

The gold agent image version is created as a draft:

ga6

Select Set Current Version; the status is now Current:

ga7

Now we can also install an agent using the gold agent image. From the Gold Agent Image screen, we select Add Hosts:

ga8

You enter the hostname, the platform name and you choose to install With Gold Agent Image:

ga9

Then the classical agent installation runs as usual. At the end we can see that our new agent is correctly subscribed to the gold image.

ga10

Now let’s patch the gold agent image: we apply the recently released agent patch 22568679 on the gold image:

Interim patches (2) :
 
Patch  22568679     : applied on Tue May 17 13:41:34 CEST 2016
Unique Patch ID:  19902660
   Created on 23 Feb 2016, 09:20:08 hrs PST8PDT
   Bugs fixed:
     22568679
 
Patch  18421945     : applied on Fri May 13 16:43:56 CEST 2016
Unique Patch ID:  18724707
  Created on 12 Aug 2015, 02:07:32 hrs PST8PDT
   Bugs fixed:
     18421945

We create a new version V2 and can see the newly applied patch on this agent:

ga11

From now on, we have the gold agent image V2, on which the latest agent patch has been applied. We have the possibility to update the agent on vmtestoradg2 to the latest patch version. First we unsubscribe this agent from the gold agent image V1, then we subscribe it to the gold agent image V2, and then, in the Subscription tab of the gold agent image, we select Update to Current Version:

ga12

We select Next and the update process is launched:

ga13

The next screen displays the different options:

ga14

In this screen, we have a lot of possibilities: run Pre-Update or Post-Update scripts, receive e-mails when the update process is finished :=)

You can follow the update progress via the console or, more practically, via emcli:

oracle@vmtestoraCC13c:/home/oracle/psi/ [oms13c] emcli get_agent_update_status 
-op_name="GOLD_AGENT_IMAGE_UPDATE_2016_05_17_14_28_48_385"
 
Showing <AGENT_NAME, STATUS OF OPERATION, OPERATION START TIME, 
OPERATION END TIME, SEVERITY, REASON> 
for each agent in the operation GOLD_AGENT_IMAGE_UPDATE_2016_05_17_14_28_48_385
 
Total Agents                                      
Status             Started                Ended                                                 
------------        ------               -------                                          
Success       2016-05-17 12:30:23 GMT    2016-05-17 12:36:25 GMT  

Finally your agent is updated and has the same patch level as your gold agent image:

oracle@vmtestoradg2:/home/oracle/ [agent13c] opatch lsinventory
Oracle Interim Patch Installer version 13.6.0.0.0
Copyright (c) 2016, Oracle Corporation.  All rights reserved.
Oracle Home       : /u00/app/oracle/agent13c/GoldImage_V2/agent_13.1.0.0.0
Central Inventory : /u00/app/oraInventory
   from           : /u00/app/oracle/agent13c/GoldImage_V2/
                    agent_13.1.0.0.0/oraInst.loc
OPatch version    : 13.6.0.0.0
OUI version       : 13.6.0.0.0
Log file location : /u00/app/oracle/agent13c/GoldImage_V2/
agent_13.1.0.0.0/cfgtoollogs/opatch/opatch2016-05-17_16-20-08PM_1.log
OPatch detects the Middleware Home as "/u00/app/oracle/agent13c/GoldImage_V2"
Lsinventory Output file location : /u00/app/oracle/agent13c/GoldImage_V2/
agent_13.1.0.0.0/cfgtoollogs/opatch/lsinv/lsinventory2016-05-17_16-20-08PM.txt
Local Machine Information::
Hostname: vmtestoradg2.it.dbi-services.com
ARU platform id: 226
ARU platform description:: Linux_AMD64
Interim patches (2) :
Patch  22568679     : applied on Tue May 17 13:41:34 CEST 2016
Unique Patch ID:  19902660
   Created on 23 Feb 2016, 09:20:08 hrs PST8PDT
   Bugs fixed:
     22568679
Patch  18421945     : applied on Fri May 13 16:43:56 CEST 2016
Unique Patch ID:  18724707
   Created on 12 Aug 2015, 02:07:32 hrs PST8PDT
Bugs fixed:
     18421945
OPatch succeeded.

Just one negative point: the agent home has changed, and there are some consequences:

  • you have to de-install the old agent home; the old agent home was /u00/app/oracle/agent13c/agent_13.1.0.0 and the new one is now /u00/app/oracle/agent13c/GoldImageV2/agent_13.1.0.0.0
  • you cannot choose the agent home name freely

ga15

All those operations can also be performed with emcli:

emcli subscribe_agents -image_name="gold_agent" -agents="vmtestoradg2:3872"

or emcli update_agents -image_name="gold_image" -agents="vmtestoradg2:3872"
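
The whole V1-to-V2 flow described above could be scripted the same way. The subscribe_agents and update_agents verbs are the ones shown just above; unsubscribe_agents is my assumption for the opposite operation, and the image name is only a placeholder, so check emcli help for the exact verbs and parameters of your EM13c version:

emcli unsubscribe_agents -image_name="gold_image" -agents="vmtestoradg2:3872"
emcli subscribe_agents -image_name="gold_image" -agents="vmtestoradg2:3872"
emcli update_agents -image_name="gold_image" -agents="vmtestoradg2:3872"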

This new feature eases agent management in Enterprise Manager: using a gold agent image allows you to deploy or update agents at the most recent patch level on multiple hosts. I recommend using emcli to perform all those operations.


Multitenant dictionary: what is stored only in CDB$ROOT?


Multitenant architecture is about dictionary separation. The idea is that all system metadata is stored only in CDB$ROOT so that space and upgrade time are optimized. Is it entirely true? Let’s count the rows in the dictionary tables.

In order to verify that, I’ve built a query that counts the rows in the dictionary tables, in CDB$ROOT and in PDB$SEED.
The idea is to query DBA_OBJECTS for ORACLE_MAINTAINED=Y objects and call a function that runs an ‘execute immediate’ to do a ‘select count(*)’. Inline functions in 12c are great for that, especially when I want to switch to another container for it. Note that I’m not 100% sure that it’s supported to switch to another container there, but at least don’t forget to switch back to the initial one.
As I’m using inline functions anyway, I’ve added one, ‘hex2asc’, that helps me look up the dictionary cache for the presence of an object (not related to this post), and I also check which tables are in the bootstrap procedure (which hardcodes some dictionary metadata into the row cache before it is available through the buffer cache).

So here is the query:


with function countrows(con_name varchar2, name varchar2) return number as
  n number;
  saved_con_name varchar2(128);
begin
  saved_con_name:=sys_context('userenv','con_name');
  execute immediate 'alter session set container='||con_name;
  execute immediate 'select count(*) from "'||name||'"' into n;
  execute immediate 'alter session set container='||saved_con_name;
  return n;
exception when others then
  execute immediate 'alter session set container='||saved_con_name;
  return null;
end;
function hex2asc(s varchar2) return varchar2 as
  r varchar2(32000);
begin
  for i in 1..length(s)/2
  loop
    exit when substr(s,2*i-1,2)='00';
    r:=r||chr(to_number(substr(s,2*i-1,2),'XX'));
  end loop;
  return r;
end;
select v.*
  ,(select count(*) from v$rowcache_parent where key like '00%' and hex2asc(substr(key,13)||'00') like object_name||'%' and existent='Y' and con_id=1) rowcache_entries
  ,(select substr(sql_text,1,30) from bootstrap$ where sql_text like '%TABLE '||object_name||'(%') bootstrap
from (
  select object_name
    ,countrows('PDB$SEED',object_name) SEED_COUNT
    ,countrows('CDB$ROOT',object_name) ROOT_COUNT
  from user_objects
  where object_type='TABLE' and oracle_maintained='Y' and object_name like '%$'
) v
where root_count>0
order by seed_count desc, root_count desc;
/

And here are the first rows of the result, sorted with the tables that have the most rows in PDB$SEED first:


OBJECT_NAME SEED_COUNT ROOT_COUNT ROWCACHE_ENTRIES BOOTSTRAP
------------------------------ ---------- ---------- --------------------------------------- ------------------------------
DEPENDENCY$ 162253 162180 2
COL$ 111623 111663 2 CREATE TABLE COL$("OBJ#" NUMBE
OBJ$ 91511 91721 2 CREATE TABLE OBJ$("OBJ#" NUMBE
OBJAUTH$ 45085 45084 2
HIST_HEAD$ 30585 50516 2
ACCESS$ 27351 109485 2
KOTAD$ 25927 27456 2
JAVASNM$ 25073 25073 2
HISTGRM$ 22653 72890 2
SETTINGS$ 19872 52936 2
SOURCE$ 17608 327589 2
ATTRIBUTE$ 13975 13975 2
PARAMETER$ 11785 11785 2
CCOL$ 11362 11400 2 CREATE TABLE CCOL$("CON#" NUMB
CON$ 9648 9686 2 CREATE TABLE CON$("OWNER#" NUM
CDEF$ 9493 9685 2 CREATE TABLE CDEF$("CON#" NUMB
METASCRIPTFILTER$ 7365 7365 2
IDL_SB4$ 7288 17787 2
ICOL$ 6432 6432 2 CREATE TABLE ICOL$("OBJ#" NUMB
IDL_UB1$ 6290 53505 2
IDL_UB2$ 5931 13029 2
OID$ 5119 6574 2
IND$ 4264 4264 2 CREATE TABLE IND$("OBJ#" NUMBE

You see immediately that the largest metadata, the source of the stored procedures in SOURCE$, is mostly stored only in CDB$ROOT. For space efficiency this is good.

However, you can see that COL$ and TAB$ have roughly the same number of rows in CDB$ROOT and in the PDB, which is not exactly what is described in the Oracle documentation.
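
A quick manual cross-check of those counts can be done the same way the inline function does it, by switching containers as SYS (not from the original query, just the same idea written out; don’t forget to switch back):

alter session set container=PDB$SEED;
select count(*) from sys.tab$;
alter session set container=CDB$ROOT;
select count(*) from sys.tab$;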

And a table such as DEPENDENCY$, which manages dependencies among objects, is still huge in the PDB. Dependencies are managed at that level.

This explains why it still takes time to upgrade or patch a PDB when the CDB has been upgraded or patched. There are not only the metadata/data links to verify; there is still some DDL to run to maintain the pluggable database dictionary.

This is not exactly what is documented in https://docs.oracle.com/database/121/CNCPT/cdbovrvw.htm#CNCPT89242:
Fewer database patches and upgrades
It is easier to apply a patch to one database than to 100 databases, and to upgrade one database than to upgrade 100 databases.

But we can expect that this will be improved in future releases.

Data Guard as a Service


A ‘Data Guard’ checkbox has been available for a long time on the Oracle Public Cloud Database as a Service, but it’s only for a few days that it actually does something: create a service with a database in Data Guard.

I’ve created a database service as usual and the only additional thing I did was to check ‘Standby Database with Data Guard’:

CaptureOPCDG001

Here are the attributes:

CaptureOPCDG002

And the creation starts. But in the progress message, you see two VMs:

CaptureOPCDG003

If you go to the Compute services, only one service is there, the first VM:

CaptureOPCDG004

But back to Database services, when you click on the service, you see the two nodes:

CaptureOPCDG005

It’s a full Data Guard configuration, with the broker automatically configured.

The latency is one millisecond on average, which is very good and allows the standby to be in SYNC. Here are some wait events with 4000 small transactions per second (2 MB of redo per second):


Wait Event Histogram DB/Inst: CDB/CDB Snaps: 42-43
-> Total Waits - units: K is 1000, M is 1000000, G is 1000000000
-> % of Waits - column heading: <=1s is truly <1024ms, >1s is truly >=1024ms
-> % of Waits - value: .0 indicates value was <.05%
-> Ordered by Event (idle events last)
 
Total ----------------- % of Waits ------------------
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
LGWR wait for redo copy 100K 99.1 .9 .1 .0 .0
LNS wait on LGWR 110K 100.0
Redo Transport MISC 110K 42.2 55.7 1.4 .5 .2 .0 .0
Redo Writer Remote Sync No 110K 100.0 .0 .0
SYNC Remote Write 110K 58.9 39.1 1.4 .4 .1 .0 .0
log file parallel write 332K 97.7 1.7 .3 .1 .1 .0 .0
log file sync 1552K 13.7 46.8 35.8 3.1 .5 .1 .0
-------------------------------------------------------------

Note that if you are not in Extreme Performance edition, you should take care not to open the standby, or Active Data Guard usage will be recorded. On service startup, Oracle manages that as follows (a quick check of the resulting open mode is sketched below):

  • /etc/rc.d/rc.local runs /home/oracle/dbsetup.sh, which runs dbstart (so the standby is opened read only)
  • then it does a shutdown abort and a startup mount
  • then, only if the edition is Extreme Performance, it opens the standby read only
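
A minimal check of the resulting state on the standby (not from the service scripts, just standard dictionary views):

select open_mode, database_role from v$database;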

Finally, if you wonder how the edition is checked: it comes from the service attributes available from 192.0.0.192:

[oracle@opccdb ~]$ links -dump http://192.0.0.192/latest/attributes/bundle
extreme-perf

As it is a managed database, you can Switchover, Failover and Reinstate from the Cloud My Services interface:

CaptureOPCDG006

SYS password on Oracle Cloud Service managed database


When you create a DBaaS on the Oracle Cloud services you have to provide an administration password in the database configuration form. You do not need a password to connect to the VM: you use an SSH key for it. On creation you provide the public key that will allow you to connect as the oracle user or the opc user (which can ‘sudo su’). But for the database you need to provide a password, which will be the password for SYS and SYSTEM. This password can be used by yourself, or by the DBaaS management. Let’s see how it is stored and how to change it.

I’m talking about managed DBaaS here, not virtual image, because in virtual image you create and administrate the database yourself.

CapturePass0

You have to define the ‘administrator password’ which will be used for SYS and SYSTEM (the -sysPassword and -systemPassword parameters of DBCA). This password must obey the complexity rules shown here:

CapturePass1

Where is the password?

When I provide a password, I don’t want it to be exposed, even within the system that is protected by this password. I may use the same password, or the same pattern, for different systems and I don’t want any user to see my password in clear text. In the database, the password is not stored: it is immediately hashed and password verification is done with the hashed value. In the orapw file, it is encrypted. If it has to be used by some programs, I expect them to use a wallet.

I want to be sure that the password I’ve provided is not stored anywhere. Let’s check:


[oracle@CDB-dg01 ~]$ grep -R Ach1z0 /var/opt/oracle 2>/dev/null
/var/opt/oracle/dg/observer.sh:connect sys/Ach1z0#d
/var/opt/oracle/ocde/assistants/dg/tmp/18267.odgda.json:{"Standby":[{"dg":{"fsfo_enabled":null,"vmpresent":"yes","spfile":{"db_unique_name":"CDB_02"},"protection_mode":null},"dborch":{"db_sid":"CDB","vm_sshkeys":"/home/oracle/.ssh/id_rsa","vm_name":"CDB-dg02-nat"}}],"Primary":{"dg":{"fsfo_enabled":null,"vmpresent":"yes","spfile":{"db_unique_name":"CDB_01"},"protection_mode":null},"dborch":{"db_version":"12102","db_sid":"CDB","vm_sshkeys":"/home/oracle/.ssh/id_rsa","vm_name":"CDB-dg01-nat","db_passwd":"Ach1z0#d"}},"tool_defaults":{"octl_cmd":"","tmpdir":"/var/opt/oracle/ocde/assistants/dg/tmp/","dborch_cmd":"","tools_dir":".."}}
/var/opt/oracle/log/dgcc/dgcc.log: 'db_passwd' => 'Ach1z0#d',
/var/opt/oracle/log/dgcc/dgcc_2016-05-20_06:49:28.log: 'db_passwd' => 'Ach1z0#d',
/var/opt/oracle/log/ocde/ocde-cmd.log: 'db_passwd' => 'Ach1z0#d',
/var/opt/oracle/log/ocde/ocde_2016-05-20_06:31:45.log: 'db_passwd' => 'Ach1z0#d',
/var/opt/oracle/log/ocde/ocde.log: 'db_passwd' => 'Ach1z0#d',
/var/opt/oracle/log/ocde/ocde-cmd_2016-05-20_06:31:45.log: 'db_passwd' => 'Ach1z0#d',
/var/opt/oracle/log/creg/creg.ini.CDB-dg02-nat:passwd=Ach1z0#d
/var/opt/oracle/creg.ini:passwd=Ach1z0#d

SYS password is exposed to everybody on the server!

That’s bad. Really bad. The most important database password is exposed in configuration files, script files and log files, and some of them are readable by everybody:


[oracle@CDB-dg01 ~]$ ls -l $(grep -lR Ach1z /var/opt/oracle 2>/dev/null)
-rw------- 1 oracle oinstall 2381 May 20 21:35 /var/opt/oracle/creg.ini
-rwxr-xr-- 1 oracle oinstall 139 May 20 07:03 /var/opt/oracle/dg/observer.sh
-rw------- 1 oracle oinstall 2353 May 20 06:49 /var/opt/oracle/log/creg/creg.ini.CDB-dg02-nat
-rw-r--r-- 1 oracle oinstall 3395 May 20 07:07 /var/opt/oracle/log/dgcc/dgcc_2016-05-20_06:49:28.log
lrwxrwxrwx 1 oracle oinstall 53 May 20 06:49 /var/opt/oracle/log/dgcc/dgcc.log -> /var/opt/oracle/log/dgcc/dgcc_2016-05-20_06:49:28.log
-rw-r--r-- 1 oracle oinstall 152448 May 20 07:11 /var/opt/oracle/log/ocde/ocde_2016-05-20_06:31:45.log
-rw-r--r-- 1 oracle oinstall 82856 May 20 07:11 /var/opt/oracle/log/ocde/ocde-cmd_2016-05-20_06:31:45.log
lrwxrwxrwx 1 oracle oinstall 57 May 20 06:34 /var/opt/oracle/log/ocde/ocde-cmd.log -> /var/opt/oracle/log/ocde/ocde-cmd_2016-05-20_06:31:45.log
lrwxrwxrwx 1 oracle oinstall 53 May 20 06:31 /var/opt/oracle/log/ocde/ocde.log -> /var/opt/oracle/log/ocde/ocde_2016-05-20_06:31:45.log
-rw-r--r-- 1 oracle oinstall 581 May 20 06:49 /var/opt/oracle/ocde/assistants/dg/tmp/18267.odgda.json

The scripts and logs used at creation can be removed. The script that starts the observer should use a wallet. But more worrying is creg.ini, because this is where all our service attributes are. And if we change or remove the password there, the service is broken. Here is an example when I initiate a switchover from the Cloud My Services interface after having removed the password there:

CapturePass2

Of course, you can still do a switchover from DGMGRL command line, but this defeats the whole purpose of a managed service.

Changing SYS password

When you want to change the SYS password, you must use the DBaaS tool: run ‘dbaascli database changepassword’. In Data Guard, you must run it from the primary site and it takes care of everything, including the copy of the orapw file, because in 12.1 (not talking about the Next Generation of Oracle Database here) you have to do that manually.


[oracle@CDB-dg02 ~]$ dbaascli database changepassword
DBAAS CLI version 1.0.0
Executing command database changepassword
Enter username whose password change is required: SYS
Enter new password:
Re-enter new password:
Unable to change password for sys

Humm.. let’s try with SYS in lowercase…


[oracle@CDB-dg02 ~]$ dbaascli database changepassword
DBAAS CLI version 1.0.0
Executing command database changepassword
Enter username whose password change is required: sys
Enter new password:
Re-enter new password:
Dataguard is enabled
Warning: Permanently added 'cdb-dg02-nat,140.86.4.51' (RSA) to the list of known hosts.
Dataguard is enabled
Dataguard is enabled
Dataguard is enabled
Successfully changed the password for user sys/system

Okay, usernames must be lowercase here… Good to know… And from the message, both SYS and SYSTEM have been changed…

Wallet

You wonder why the password is in clear text in the configuration file? So do I. I would expect the use of an external password file here. But wait… there is one:


[oracle@CDB-dg01 ~]$ grep WALLET $ORACLE_HOME/network/admin/sqlnet.ora
ENCRYPTION_WALLET_LOCATION = (SOURCE=(METHOD=FILE)(METHOD_DATA=(DIRECTORY=/u01/app/oracle/admin/CDB/tde_wallet)))
SQLNET.WALLET_OVERRIDE = FALSE
WALLET_LOCATION = (SOURCE=(METHOD_DATA=(DIRECTORY=/u01/app/oracle/admin/CDB/db_wallet))(METHOD=FILE))

and this wallet has credentials for SYS:

[oracle@CDB-dg01 db_wallet]$ mkstore -wrl /u01/app/oracle/admin/CDB/db_wallet -listCredential
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.
 
List credential (index: connect_string username)
3: CDB_02 sys
2: CDB_01 sys
1: CDB sys

and those are entries to connect as SYS to each site:

[oracle@CDB-dg01 db_wallet]$ for i in CDB CDB_01 CDB_02 ; do tnsping $i ; done | grep DESC
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = CDB-dg01) (PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = CDB.may.oraclecloud.internal)))
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = CDB-dg01-nat) (PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SID = CDB)))
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = CDB-dg02-nat) (PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SID = CDB)))

So this is a nice way to connect without providing the password, and without the password being visible… or is it?


[oracle@CDB-dg01 db_wallet]$ mkstore -wrl /u01/app/oracle/admin/CDB/db_wallet -viewEntry oracle.security.client.password1
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.
 
oracle.security.client.password1 = Ach1z0#d
[oracle@CDB-dg01 db_wallet]$ mkstore -wrl /u01/app/oracle/admin/CDB/db_wallet -viewEntry oracle.security.client.password2
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.
 
oracle.security.client.password2 = Ach1z0#d

Arghhh… the password is plainly visible here from the oracle user.
When I create a wallet, I provide a password, and without the password we can use the credential but not display it. But here it seems that the wallet was created as ‘auto login’ with the ‘-createALO’ option. Want to know more about that option? You can see it in the online help:

[oracle@CDB-dg01 db_wallet]$ mkstore
Oracle Secret Store Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.
 
mkstore [-wrl wrl] [-create] [-createSSO] [-createLSSO] [-createALO] [-delete] [-deleteSSO] [-list] [-createEntry alias secret] [-viewEntry alias] [-modifyEntry alias secret] [-deleteEntry alias] [-createCredential connect_string username password] [-listCredential] [-modifyCredential connect_string username password] [-deleteCredential connect_string] [-help] [-nologo]

But if you want more information about -createALO you have to check the documentation. Well, is it documented?
You need to wait until Bug 21152979 : PROVIDE EXPLANATION FOR EACH OPTION OF MKSTORE UTILITY is fixed…
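
For comparison, here is a minimal sketch of how a password-protected (non auto-login) wallet could be created, using only options listed in the help output above. The directory name is hypothetical, and mkstore prompts for the wallet password and for the secret:

mkstore -wrl /u01/app/oracle/admin/CDB/db_wallet_pwd -create
mkstore -wrl /u01/app/oracle/admin/CDB/db_wallet_pwd -createCredential CDB sys
mkstore -wrl /u01/app/oracle/admin/CDB/db_wallet_pwd -listCredential

With such a wallet, -listCredential and -viewEntry require the wallet password, so the SYS password is no longer readable by anyone who can read the files.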

Conclusion

Ok, better to stop here. Lots of interesting things here about automation, but it is still far from what we have been doing with the dbi services Database Management Kit for years.

With DBaaS it is easy to configure a database in a Data Guard configuration with a few clicks, no doubt. But what is done here is very far from the best practices we follow when we set up a Data Guard configuration. It is very good for a lab or a test where security is not that important.

Do not use sensitive passwords for the DBaaS configuration, because they are exposed to everybody who can read files on the created VM. You may be tempted to open the database to developers, giving them high privileges because it’s an isolated environment in the Cloud, so no risk. But then you expose the SYS password, which may give some clues about the password patterns you use elsewhere.

DB_FLASHBACK_RETENTION_TARGET may hang your database


DB_FLASHBACK_RETENTION_TARGET is set to keep enough flashback logs to be able to flash back the database within the specified retention window. But it’s supposed to be a target only, meaning that under space pressure some files can be deleted. Be careful though: there are cases where they are not, and then the database hangs until you set a lower retention.

The fun part is the message telling you that it cannot reclaim space from a 50GB FRA where 0% is used.

Less funny is the primary database hanging because its own FRA is full (the deletion policy being APPLIED ON ALL STANDBY):

CaptureCDB02FLASHBACKRET

What happened

So, I have a Data Guard configuration where the deletion policy is ‘APPLIED ON ALL STANDBY’ and both sites have FLASHBACK ON.

I’ve no guaranteed restore points:

RMAN> list restore point all;
 
using target database control file instead of recovery catalog
SCN RSP Time Type Time Name
---------------- --------- ---------- --------- ----
 

and the flashback retention is 1 day, to allow a possible reinstate, or simply to allow using the standby to flash back to a recent point in time:

SQL> show parameter flashback
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target integer 1440

Here is the space usage of my standby database FRA when I have high activity on the primary:


SQL> select * from v$recovery_area_usage;
 
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES CON_ID
----------------------- ------------------ ------------------------- --------------- ----------
CONTROL FILE 0 0 0 0
REDO LOG 0 0 0 0
ARCHIVED LOG 0 0 0 0
BACKUP PIECE .1 0 3 0
IMAGE COPY 0 0 0 0
FLASHBACK LOG 98.5 0 50 0
FOREIGN ARCHIVED LOG 0 0 0 0
AUXILIARY DATAFILE COPY 0 0 0 0

FLASHBACK LOG files are filling the 50GB I have, because I generate more than 50GB of changes per day. They are not marked as reclaimable, but I expected them to be reclaimable in case of space pressure on the FRA, because the retention is a target retention, not a guaranteed retention.

ARCHIVED LOG become reclaimable as soon as they are applied thanks to the deletion policy. And they are actually reclaimed because space is needed for flashback logs.

At that point, I would expect that when new redo is coming the archived logs can always be written, because archived logs have priority over non-guaranteed flashback logs.

I forgot to tell you that the UNDO tablespace has not been created with RETENTION GUARANTEE:

SQL> select contents,retention from dba_tablespaces;
 
CONTENTS RETENTION
--------- -----------
PERMANENT NOT APPLY
PERMANENT NOT APPLY
UNDO NOGUARANTEE
TEMPORARY NOT APPLY
PERMANENT NOT APPLY

But actually, my standby database is in an archiver-stuck situation:

Thu May 26 07:42:01 2016
Errors in file /u01/app/oracle/diag/rdbms/cdb_02/CDB/trace/CDB_arc1_5612.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 53687091200 bytes is 0.00% used, and has 53687091200 remaining bytes available.
Thu May 26 07:42:01 2016
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
************************************************************************
Thu May 26 07:42:01 2016
Errors in file /u01/app/oracle/diag/rdbms/cdb_02/CDB/trace/CDB_arc1_5612.trc:
ORA-19809: limit exceeded for recovery files
ORA-19804: cannot reclaim 1073741824 bytes disk space from 53687091200 limit
Thu May 26 07:43:01 2016
Errors in file /u01/app/oracle/diag/rdbms/cdb_02/CDB/trace/CDB_arc1_5612.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 53687091200 bytes is 0.00% used, and has 53687091200 remaining bytes available.
Thu May 26 07:43:01 2016
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
************************************************************************
Thu May 26 07:43:01 2016
Errors in file /u01/app/oracle/diag/rdbms/cdb_02/CDB/trace/CDB_arc1_5612.trc:
ORA-19809: limit exceeded for recovery files
ORA-19804: cannot reclaim 1073741824 bytes disk space from 53687091200 limit
Thu May 26 07:43:46 2016

This blocks the standby with a big gap, and this may have bad consequences on primary availability and protection.

Workaround

The workaround is to lower the flashback retention target so that all changes fit in the FRA:

SQL> alter system set db_flashback_retention_target=60;

and as soon as I did it some flashback logs became reclaimable:

SQL> select * from v$recovery_area_usage;
 
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES CON_ID
----------------------- ------------------ ------------------------- --------------- ----------
CONTROL FILE 0 0 0 0
REDO LOG 0 0 0 0
ARCHIVED LOG 0 0 0 0
BACKUP PIECE .1 0 3 0
IMAGE COPY 0 0 0 0
FLASHBACK LOG 96.5 68 49 0
FOREIGN ARCHIVED LOG 0 0 0 0
AUXILIARY DATAFILE COPY 0 0 0 0

Which you can see in the alert.log as they are deleted to reclaim space:


Thu May 26 07:43:46 2016
ALTER SYSTEM SET db_flashback_retention_target=60 SCOPE=BOTH;
Thu May 26 07:44:01 2016
Deleted Oracle managed file /u03/app/oracle/fast_recovery_area/CDB_02/flashback/o1_mf_cncs7qnx_.flb
Deleted Oracle managed file /u03/app/oracle/fast_recovery_area/CDB_02/archivelog/2016_05_26/o1_mf_0_0_cnfbb1b4_.arc

Conclusion

I’ll open an SR for it (easier to format in WordPress than in MOS). This is 12cR1 multitenant with Patch Set Update 12.1.0.2.160119.
For the moment, the recommendation is: always monitor the FRA for (used - reclaimable) space, even on a standby where archivelogs become reclaimable as soon as they are applied. A minimal monitoring query is sketched below.
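
A minimal monitoring query, based on the same view used above (thresholds and alerting are up to you):

select file_type, percent_space_used - percent_space_reclaimable pct_not_reclaimable
from v$recovery_area_usage
order by 2 desc;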

Extended clusters and asm_preferred_read_failure_groups


When you have two sites that are not too far apart, you can build an extended cluster: you have one node on each site. And you can also use ASM normal redundancy to store data on each site (each diskgroup has a failure group per site). Writes are multiplexed, so the latency between the two sites increases the write time. By default, reads can be done from one site or the other. But we can, and should, define that preference goes to local reads.

The setup is easy. In the ASM instance you list the failure groups that are on the same site, with the ‘asm_preferred_read_failure_groups’ parameter. You set that with an ALTER SYSTEM SCOPE=spfile SID=… because you will have different values for each instance. Of course, that supposes that you know the SID of the ASM instance that runs on a specific site. If you are in Flex ASM, don’t ask: wait for 12.2 or read Bertrand Drouvot’s blog post.
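
As a sketch, with hypothetical diskgroup and failure group names (DATA with FAILGRP_SITE1/FAILGRP_SITE2) and the usual +ASM1/+ASM2 instance names, that gives:

alter system set asm_preferred_read_failure_groups='DATA.FAILGRP_SITE1' scope=spfile sid='+ASM1';
alter system set asm_preferred_read_failure_groups='DATA.FAILGRP_SITE2' scope=spfile sid='+ASM2';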

I’m on an extended cluster where the two sites have between 0.3 and 0.4 milliseconds of latency. I’m checking the storage with SLOB, so this is the occasion to check how asm_preferred_read_failure_groups helps with I/O latency.

I use a simple SLOB configuration for physical I/O, read only, single block, and check the wait event histogram for ‘db file sequential read’.
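
The histograms below probably come from a query like this one on the microsecond histogram view (the column names match the output, but the exact query is my assumption):

select event, wait_time_micro, wait_count, wait_time_format
from v$event_histogram_micro
where event='db file sequential read'
order by wait_time_micro;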
Here is an example of output:

EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
------------------------------ --------------- ---------- ------------------------------
db file sequential read 1 0 1 microsecond
db file sequential read 2 0 2 microseconds
db file sequential read 4 0 4 microseconds
db file sequential read 8 0 8 microseconds
db file sequential read 16 0 16 microseconds
db file sequential read 32 0 32 microseconds
db file sequential read 64 0 64 microseconds
db file sequential read 128 0 128 microseconds
db file sequential read 256 538 256 microseconds
db file sequential read 512 5461 512 microseconds
db file sequential read 1024 2383 1 millisecond
db file sequential read 2048 123 2 milliseconds
db file sequential read 4096 148 4 milliseconds
db file sequential read 8192 682 8 milliseconds
db file sequential read 16384 3777 16 milliseconds
db file sequential read 32768 1977 32 milliseconds
db file sequential read 65536 454 65 milliseconds
db file sequential read 131072 68 131 milliseconds
db file sequential read 262144 6 262 milliseconds

It seems that half of the reads are served by the array cache and the other half are above disk latency time.

Now I set the asm_preferred_read_failure_groups to the remote site, to measure reads coming from there.

alter system set asm_preferred_read_failure_groups='DATA1_MIR.FAILGRP_SH' scope=memory;

and here is the result on similar workload:

EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
------------------------------ --------------- ---------- ------------------------------
db file sequential read 1 0 1 microsecond
db file sequential read 2 0 2 microseconds
db file sequential read 4 0 4 microseconds
db file sequential read 8 0 8 microseconds
db file sequential read 16 0 16 microseconds
db file sequential read 32 0 32 microseconds
db file sequential read 64 0 64 microseconds
db file sequential read 128 0 128 microseconds
db file sequential read 256 0 256 microseconds
db file sequential read 512 5425 512 microseconds
db file sequential read 1024 6165 1 millisecond
db file sequential read 2048 150 2 milliseconds
db file sequential read 4096 89 4 milliseconds
db file sequential read 8192 630 8 milliseconds
db file sequential read 16384 3598 16 milliseconds
db file sequential read 32768 1903 32 milliseconds
db file sequential read 65536 353 65 milliseconds
db file sequential read 131072 36 131 milliseconds
db file sequential read 262144 0 262 milliseconds
db file sequential read 524288 1 524 milliseconds

The pattern is similar except that I have nothing lower than 0.5 millisecond: I/Os served by the storage cache carry there the additional 0.3 milliseconds of latency from the remote site. Of course, when we are above the millisecond, we don’t see the difference.

Now let’s set the right setting where preference should go to local reads:

alter system set asm_preferred_read_failure_groups='DATA1_MIR.FAILGRP_VE' scope=memory;

and the result:

EVENT WAIT_TIME_MICRO WAIT_COUNT WAIT_TIME_FORMAT
------------------------------ --------------- ---------- ------------------------------
db file sequential read 1 0 1 microsecond
db file sequential read 2 0 2 microseconds
db file sequential read 4 0 4 microseconds
db file sequential read 8 0 8 microseconds
db file sequential read 16 0 16 microseconds
db file sequential read 32 0 32 microseconds
db file sequential read 64 0 64 microseconds
db file sequential read 128 0 128 microseconds
db file sequential read 256 1165 256 microseconds
db file sequential read 512 9465 512 microseconds
db file sequential read 1024 519 1 millisecond
db file sequential read 2048 184 2 milliseconds
db file sequential read 4096 227 4 milliseconds
db file sequential read 8192 705 8 milliseconds
db file sequential read 16384 3350 16 milliseconds
db file sequential read 32768 1743 32 milliseconds
db file sequential read 65536 402 65 milliseconds
db file sequential read 131072 42 131 milliseconds
db file sequential read 262144 1 262 milliseconds

Here the fast reads are around 0.5 millisecond, and about a thousand reads had a service time lower than 0.3 milliseconds, which was not possible when reading from the remote site.

Here is the pattern in an Excel chart, where you see no big difference for latencies above 4 milliseconds.

CapturePrefFailGrp

With an efficient storage array, extended cluster latency may penalize write performance. However, data file writes are asynchronous (DBWR), so that latency is not part of the user response time. I’m not talking about redo logs here: for redo you have to choose between putting it on a local-only diskgroup or on a mirrored one. This depends on availability requirements and the latency between the two sites.

So, when you have non-uniform latency among failure groups, don’t forget to set asm_preferred_read_failure_groups. And test it with SLOB as I did here: what you expect from theoretical latencies should be visible in the wait event histogram.

Learning and troubleshooting: Follow the path


You query a simple table to get its rows. Did you ever ask yourself how Oracle knows which blocks to read from disk? Let’s question everything and follow the path through the dictionary, bootstrapping, the spfile… up to the GPnP profile.

If you want to read the rows that are stored in a table, you have to read from the segment extents.

Segment Extents

How do you find those extents? You need to get the extent list from the tablespace header, and some information from the segment header.

Where is the segment header? This is recorded in the dictionary. For a table it is in SYS.TAB$.

Here is the definition that you can see in ?/rdbms/admin/dcore.bsq which is run at CREATE DATABASE time

create table tab$ /* table table */
( obj# number not null, /* object number */
dataobj# number, /* data layer object number */
ts# number not null, /* tablespace number */
file# number not null, /* segment header file number */
block# number not null, /* segment header block number */

TAB$ has the whole table definition, and TS#, FILE# and BLOCK# uniquely identify the block where the segment header is. TS# identifies the tablespace, FILE# identifies the datafile within the tablespace, and BLOCK# is the offset within that file (there is only one block size per tablespace). This is the physical identifier of a segment.

Let’s take SCOTT.EMP as an example


SQL> select object_id,object_type,data_object_id from dba_objects where owner='SCOTT' and object_name='EMP';
 
OBJECT_ID OBJECT_TYPE DATA_OBJECT_ID
---------- ----------------------- --------------
121515 TABLE 121533

I get segment header from TAB$


SQL> select ts#,file#,block# from sys.tab$ where obj#=121515;
 
TS# FILE# BLOCK#
---------- ---------- ----------
4 6 458

The extents listed in the tablespace header can be displayed from the X$KTFBUE fixed table (if it were a Dictionary Managed Tablespace this information would be in the dictionary table UET$):


SQL> select ktfbuefno,ktfbuebno,ktfbueblks from x$ktfbue where ktfbuesegtsn=4 and ktfbuesegfno=6 and ktfbuesegbno=458;
 
KTFBUEFNO KTFBUEBNO KTFBUEBLKS
---------- ---------- ----------
6 456 8

From there we know that SCOTT.EMP data is stored in 8 blocks starting from block number 456 in file 6. This file number is relative to the tablespace, which is the tablespace number 4 because all segment extents are in the same tablespace as the segment header.
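
The same information is of course visible through the documented view; this is just a cross-check of the internal path above:

select tablespace_name, relative_fno, block_id, blocks from dba_extents
where owner='SCOTT' and segment_name='EMP';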

BOOTSTRAP$

Ok, so you can get everything from the dictionary, starting from TAB$. But it’s a table, so if you want to read it you need to read blocks from its extents. Where are those extents?

Easy: TAB$ itself has its information in TAB$.


SQL> select obj# from obj$ where name='TAB$';
 
OBJ#
----------
4
 
SQL> select ts#,file#,block# from sys.tab$ where obj#=4;
 
TS# FILE# BLOCK#
---------- ---------- ----------
0 1 144

Information is there: segment header is in tablespace 0 (which is SYSTEM) file 1 and block 144. And we can get extents from x$ktfbue where ktfbuesegtsn=0 and ktfbuesegfno=1 and ktfbuesegbno=144

But wait a minute… Am I saying that in order to get TAB$ data you first need to get TAB$ data? This is not possible: it’s a Catch-22. TAB$ metadata about itself must be available before being able to read the table. This core dictionary table information must be hardcoded, and this is done by special bootstrapping code. The dcore.bsq above is not a normal SQL script but a bootstrap SQL script with a special syntax. When those core dictionary objects are created by CREATE DATABASE, their extents go into a hardcoded location in the SYSTEM datafile.

And when an instance opens a database, the metadata about them is hardcoded in order to have the minimal information needed to find the rest of the dictionary. This is done by pre-filling the dictionary cache with some bootstrap code. And this code is visible:


SQL> select * from bootstrap$ where sql_text like 'CREATE TABLE TAB$%';
 
LINE# OBJ# SQL_TEXT
---------- ---------- ---------------------------------------------------------------------------------------------------------------------------------
4 4 CREATE TABLE TAB$("OBJ#" NUMBER NOT NULL,"DATAOBJ#" NUMBER,"TS#" NUMBER NOT NULL,"FILE#" NUMBER NOT NULL,"BLOCK#" NUMBER NOT NULL
,"BOBJ#" NUMBER,"TAB#" NUMBER,"COLS" NUMBER NOT NULL,"CLUCOLS" NUMBER,"PCTFREE$" NUMBER NOT NULL,"PCTUSED$" NUMBER NOT NULL,"INIT
RANS" NUMBER NOT NULL,"MAXTRANS" NUMBER NOT NULL,"FLAGS" NUMBER NOT NULL,"AUDIT$" VARCHAR2(38) NOT NULL,"ROWCNT" NUMBER,"BLKCNT"
NUMBER,"EMPCNT" NUMBER,"AVGSPC" NUMBER,"CHNCNT" NUMBER,"AVGRLN" NUMBER,"AVGSPC_FLB" NUMBER,"FLBCNT" NUMBER,"ANALYZETIME" DATE,"SA
MPLESIZE" NUMBER,"DEGREE" NUMBER,"INSTANCES" NUMBER,"INTCOLS" NUMBER NOT NULL,"KERNELCOLS" NUMBER NOT NULL,"PROPERTY" NUMBER NOT
NULL,"TRIGFLAG" NUMBER,"SPARE1" NUMBER,"SPARE2" NUMBER,"SPARE3" NUMBER,"SPARE4" VARCHAR2(1000),"SPARE5" VARCHAR2(1000),"SPARE6" D
ATE) STORAGE ( OBJNO 4 TABNO 1) CLUSTER C_OBJ#(OBJ#)

The code to create TAB$ is there, with additional bootstrapping SQL syntax to hardcode the OBJECT_ID. This table is actually stored in a CLUSTER segment, and its definition is also hardcoded in the bootstrapping code:


SQL> select * from bootstrap$ where sql_text like 'CREATE CLUSTER C_OBJ#%';
 
LINE# OBJ# SQL_TEXT
---------- ---------- --------------------------------------------------------------------------------
2 2 CREATE CLUSTER C_OBJ#("OBJ#" NUMBER) PCTFREE 5 PCTUSED 40 INITRANS 2 MAXTRANS 25
5 STORAGE ( INITIAL 136K NEXT 200K MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREA
SE 0 OBJNO 2 EXTENTS (FILE 1 BLOCK 144)) SIZE 800

File 1 block 144: this is exactly what we have seen when querying TAB$, but actually it’s a hardcoded value. You will find exactly the same in every Oracle database since 8.0.

It seems that it was a different value in Oracle 7:
CaptureBOOTSTRAP
I don’t remember whether the upgrade to 8.0 had to update the first blocks of the SYSTEM datafile, but it was probably the case. If anyone has a database that has been upgraded since Oracle 7 then please tell me, but that should be rare nowadays. For sure, the code that warms up the dictionary cache must be consistent with how those segments are stored in the SYSTEM tablespace.

Controlfile

Let’s continue. We know how to read data as soon as the database is opened, because all the required information is in the SYSTEM datafile.

But how do we know where this datafile is? The controlfile knows the location of all datafiles, and this information is available from the mount stage.

Here is an example after alter database backup controlfile to trace;


CREATE CONTROLFILE REUSE DATABASE "RACDB" NORESETLOGS NOARCHIVELOG
MAXLOGFILES 192
MAXLOGMEMBERS 3
MAXDATAFILES 1024
MAXINSTANCES 32
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '+DATA/RACDB/ONLINELOG/group_1.262.906247575' SIZE 50M BLOCKSIZE 512,
GROUP 2 '+DATA/RACDB/ONLINELOG/group_2.263.906247575' SIZE 50M BLOCKSIZE 512,
GROUP 3 '+DATA/RACDB/ONLINELOG/group_3.264.906247575' SIZE 50M BLOCKSIZE 512,
GROUP 4 '+DATA/RACDB/ONLINELOG/group_4.274.906299789' SIZE 50M BLOCKSIZE 512,
GROUP 5 '+DATA/RACDB/ONLINELOG/group_5.275.906299789' SIZE 50M BLOCKSIZE 512,
GROUP 6 '+DATA/RACDB/ONLINELOG/group_6.276.906299789' SIZE 50M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
'+DATA/RACDB/DATAFILE/system.258.906247493',
'+DATA/RACDB/DATAFILE/sysaux.257.906247463',
'+DATA/RACDB/DATAFILE/undotbs1.260.906247529',

Now the question is:
Where is the controlfile? This information comes from an instance parameter:


SQL> show parameter control_files
 
NAME TYPE VALUE
------------------------------------ ----------- ---------------------------------------------
control_files string +DATA/RACDB/CONTROLFILE/current.261.906247573

Oh good, I’m in ASM, so I can continue my path with questions such as: how do we find the controlfile?

But before that, where does this instance parameter come from? The SPFILE is read at instance startup.

Where is the SPFILE?

Database SPFILE


SQL> show parameter spfile
 
NAME TYPE VALUE
------------------------------------ ----------- ----------------------------------------------
spfile string +DATA/RACDB/PARAMETERFILE/spfile.269.906247759

In single instance, the SPFILE is found in $ORACLE_HOME/dbs, but here I’m in RAC and the database resource has this information:

[oracle@racp1vm1 ~]$ srvctl config database -db RACDB
Database unique name: RACDB
Database name: RACDB
Oracle home: /u01/app/oracle/product/12.1.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/RACDB/PARAMETERFILE/spfile.269.906247759

ASM

So, in order to find the datafiles or the SPFILE, you must have access to the +DATA diskgroup. The information about it is available from the ASM instance, and this is easy to find because there is only one ASM instance on the server.

But now let’s go to the ASM instance. It stores all its metadata in the disks, which are accessible once the instance has started. What does it need to start and to find those disks?
The ASM instance has an SPFILE, and information from the SPFILE is mandatory to access the diskgroups. So where is the SPFILE of the ASM instance?


[grid@racp1vm1 ~]$ asmcmd spget
+CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691

Okay… another Catch-22 here. The ASM SPFILE is stored in ASM and you need an ASM instance to access it… but starting an ASM instance needs SPFILE information…

GPnP profile

I’ll not go into the details, which are very well explained by Anju Garg in her blog post http://oracleinaction.com/asm-spfile-on-asm/ and in the Robert Bialek one referenced at the end of it.


[grid@racp1vm1 ~]$ gpnptool get -o- | xmllint --format - | grep SPFile
 
Success.
<orcl:ASM-Profile id="asm" DiscoveryString="/dev/mapper/*" SPFile="+CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691" Mode="remote"/>

Everything is there. The GPnP profile has the ASM discovery string that lists the system disks. They are scanned at cluster start and all bootstrap information is found in the bootstrap header.

When you change the asm_diskstring from the ASM instance, or when you change the SPFILE location from asmcmd, the GPnP profile is updated. And if it is corrupted, the cluster doesn’t start.

I used ‘gpnptool get’, but that doesn’t tell us where the GPnP profile is stored.


[grid@racp1vm1 ~]$ ls $ORACLE_HOME/gpnp/$HOSTNAME/profiles/peer/profile.xml
/u01/app/12.1.0/grid/gpnp/racp1vm1/profiles/peer/profile.xml

Here is the content, but you need gpnptool to change it because it is signed.

[grid@racp1vm1 ~]$ cat $ORACLE_HOME/gpnp/$HOSTNAME/profiles/peer/profile.xml
<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="9" ClusterUId="3db40dc75a3aef58bf2f0b71e011b137" ClusterName="ws-dbi-scan1" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.22.0" Adapter="bond0" Use="public"/><gpnp:Network id="net2" IP="10.1.1.0" Adapter="enp0s10" Use="cluster_interconnect,asm"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="" SPFile="+CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691" Mode="remote"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>3MkzWCCTRYJ1FDJ4h8G6PHrgfQQ=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>XQtNrjbazMkfCO1e52scpC8y3tdpVbyFxWPPXirbZOmZ+ajcnAOD85qMJUPBaXG8G2sLCWVX5ir+Reo5f0ewyHCtzpGud9IWoYhb01T2W0o4WYzFFcwncxHDWDBCdLiKdSBOEJytRMCufgfciA/v6nzxWzDRS/7svWzG7shVzpI=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>

/etc/oratab

I used $ORACLE_HOME which is the Grid infrastructure home. If you’re not sure where it is, look at the +ASM entry in /etc/oratab:

[grid@racp1vm1 ~]$ grep +ASM /etc/oratab
+ASM1:/u01/app/12.1.0/grid:N # line added by Agent

So, with $HOSTNAME and /etc/oratab you can follow the path and understand how Oracle gets any information from the system: configuration, metadata, data. Of course, you can continue your questions at the system level: how the disks are opened, and find your way through devices, multipathing…

Conclusion

Even when a system is complex (and Grid Infrastructure / RAC is complex) you can follow the path and understand where information comes from. There’s no magic. There’s no black box. Everything can be understood. It may take time reading documentation, reading logs, testing, tracing… But this is how you can understand exactly how it works and thus be prepared to troubleshoot any issue. As an example, in a lab (such as the racattack one) you can try to mess up some ASM SPFILE parameters, for example change asm_diskstring to some random characters, and try to restart the cluster. If you don’t know how it works, you may spend a long time before fixing the issue, and the risk is that you break it even more. If you have read and understood the above, then you will know exactly what has to be fixed: if the GPnP profile can’t find the disks, then nothing can go further.

Where is the ASM SPFILE?


In the previous blog post, to the question “where is the ASM SPFILE?” I answered by running ‘asmcmd spget’. But this is available only if the ASM instance is started. If you have corrupted your SPFILE (and it’s easy to do) you can start with a local PFILE and run CREATE SPFILE= FROM PFILE. But then you get the location of the local one you used. If you need to know the shared location, let’s see the different places where you can find it.

First I set the environment for the ASM instance.

[grid@racp1vm1 ~]$ . oraenv <<< +ASM1
ORACLE_SID = [grid] ? The Oracle base has been set to /u01/app/grid

If the ASM instance is up you can get the SPFILE location from ASMCMD:

[grid@racp1vm1 ~]$ asmcmd spget
+CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691

or from SQL*Plus:

[grid@racp1vm1 ~]$ sqlplus -s / as sysasm
show parameter spfile
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string +CRS_DG/ws-dbi-scan1/ASMPARAME
TERFILE/registry.253.905527691

If the ASM instance is up but was started with another SPFILE (or a PFILE), you can search for an SPFILE in all diskgroups:

[grid@racp1vm1 ~]$ asmcmd find --type PARAMETERFILE '*' '*'
+CRS_DG/_MGMTDB/PARAMETERFILE/spfile.268.905529015
+DATA/RACDB/PARAMETERFILE/spfile.269.906247759
+DATA/RACDB/spfileRACDB.ora

But if the ASM instance is not up, let’s see where we can find its location.

From GPnP tool

The cluster needs to know the SPFILE location in order to start the ASM instance, and for this purpose the location is stored in the GPnP profile:

[grid@racp1vm1 ~]$ gpnptool get -o- | xmllint --format - | grep SPFile
Success.
<orcl:ASM-Profile id="asm" DiscoveryString="" SPFile="+CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691" Mode="remote"/>

Of course you can get it directly from the GPnP profile xml file


[grid@racp1vm1 ~]$ xmllint --shell $ORACLE_HOME/gpnp/$HOSTNAME/profiles/peer/profile.xml <<< "cat //*[@SPFile]/@SPFile"
SPFile="+CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691"

If you messed up the GPnP profile, you may want to find the previous value.

From alert.log

If the ASM instance has been started previously, which is probably the case, you can see the location of the SPFILE in the alert.log:


[grid@racp1vm1 ~]$ adrci exec="set home +asm ; show alert -tail 1000" | grep "parameter setting"
Using parameter settings in server-side spfile +CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691
Using parameter settings in server-side spfile +CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691
Using parameter settings in server-side spfile +CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691
Using parameter settings in server-side spfile +CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691

$ORACLE_HOME/dbs

What is important to know is that there is nothing in $ORACLE_HOME/dbs:

[grid@racp1vm1 ~]$ ls $ORACLE_HOME/dbs
ab_+ASM1.dat hc_+APX1.dat hc_+ASM1.dat hc_-MGMTDB.dat id_+ASM1.dat init-MGMTDB.ora init.ora lk_MGMTDB

and the ASM instance startup looks there only if nothing is found in the GPnP profile.

Backup the ASM SPFILE

It’s always a good idea to back up the SPFILE, in the same way you back up the OCR and the GPnP profile:

[grid@racp1vm1 ~]$ asmcmd spbackup +CRS_DG/ws-dbi-scan1/ASMPARAMETERFILE/registry.253.905527691 /tmp/asmspfile.bak
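
And to put a backup back in place, asmcmd also has spcopy and spset. A sketch only, with a hypothetical destination alias (check the documentation before doing this on a running cluster):

asmcmd spcopy /tmp/asmspfile.bak +CRS_DG/asmspfile_restored.ora
asmcmd spset +CRS_DG/asmspfile_restored.ora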


EM 13c corrective actions


With Enterprise Manager 13c, we have the possibility to define corrective actions. Let me show you how to use them.

From the Enterprise → Monitoring menu, select Corrective Actions:

ca1

Choose Add Space to Tablespace in the Create Library Corrective Action list:

ca2

Enter the name, a description and choose Metric Alert as event:

ca3

Finally, choose Save to Library:

ca4

You have to publish your corrective action, as it has been created as a draft version: select Publish.

You can edit the Add Space corrective action parameters to adapt it to your environment:

ca5

Now we create a tablespace and attach the corrective action to the Tablespace Space Used metric alert, in Oracle Database → Monitoring → Metric and Collection Settings:

ca6

We edit the metric Tablespace Full, Tablespace Space Used (%):

ca7

In the Monitored Objects, we select Edit:

ca8

We select Add in the Warning Corrective Actions:

ca9

Enter a name, the database and host credentials, and choose the Add Space to Tablespace corrective action.

Your corrective action is now applied to the Tablespace Space Used metric. So if we insert a lot of data into our 5 MB tablespace, the datafile should grow automatically. Let’s test it:

ca10

Connected as a user with the default tablespace PSI, we insert some values:

SQL> create table test as select * from sys.dba_segments where rownum < 100;

Table created.

After a few more inserts, the PSI tablespace is full. The corrective action has not run yet because its collection is scheduled every 30 minutes, so we change it to 5 minutes:

ca11

You also have the possibility to submit the job if you do not want to wait:

ca12

You can display how the job is running:

ca13

The job is done, the tablespace PSI has been increased automatically :=)

ca14

This corrective action allows increasing a tablespace size, but considering that in many Oracle production environments the datafiles are in autoextend mode, this feature might not be very useful.

But we could use this corrective action to back up the database archive logs when the archive log destination is filled to more than the 80% threshold.

In Monitoring → Corrective Actions, select RMAN Script and choose Go:

ca15

Enter the name, a short description, and choose the event type Metric Alert:

ca16

In the Parameters tab, enter your script to back up the archive logs:

ca17
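
As an illustration, a minimal version of such a script could look like this (whether the corrective action expects raw RMAN commands or a shell wrapper, as well as the device and retention settings, are assumptions to adapt to your backup strategy):

backup archivelog all not backed up 1 times delete input;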

Then, for my test database, in the Metric and Collection Settings we select the metric Archive Area Used and add the backup archive log corrective action:

ca18

ca19

Now I modify the warning threshold for test purposes and generate activity on the database to produce archive logs; the job is successfully launched:

ca20

This functionality is quite easy to implement and quite practical for Oracle DBAs, especially to back up archive logs.

EM13c and postgres plugin


As an Oracle DBA I use Enterprise Manager to monitor Oracle databases at my clients’ sites. I also administer more and more Postgres databases, so I decided to download the Blue Medora plugin in order to monitor the PostgreSQL databases with Enterprise Manager 13c and avoid having a different monitoring tool for each kind of database.

Once you have downloaded the Blue Medora plugin for EM13c, you have to unzip it:

oracle@vmtestoraCC13c:/home/oracle/psi/ [rdbms12102] unzip bm.em.xpgs_12.1.0.2.0_2.0.3.zip
Archive:  bm.em.xpgs_12.1.0.2.0_2.0.3.zip
  inflating: bm.em.xpgs_12.1.0.2.0_2.0.3.opar
......

Then like other plugins, you have to import it in the OMS (Oracle Management Server):

oracle@vmtestoraCC13c:/home/oracle/psi/ [oms13c] emcli import_update
 -file=/home/oracle/psi/bm.em.xpgs_12.1.0.2.0_2.0.3.opar -omslocal
Processing update: Plug-in - PostgreSQL Monitoring
Successfully uploaded the update to Enterprise Manager. 
Use the Self Update Console to manage this update.

Once imported, from the Extensibility plugin page, we can display the PostgreSQL plugin:

pp1

We have to deploy the plugin to the OMS:

pp2

The next step is to deploy the plugin on an agent host where a PostgreSQL environment already exists; we select the appropriate agent and click Continue:

pp3

The PostgreSQL plugin is successfully deployed to the agent:

pp4

Once the plugin is deployed, we need to add a PostgreSQL database target. I used Add Target Declaratively and chose PostgreSQL Database as the target type:

pp5

On the next screen you enter the target name and useful information such as hostname, login, password and port number:

pp6
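
Note that the PostgreSQL instance must accept TCP connections from the agent host with the login you provide here. A minimal sketch of the settings (addresses and authentication method are hypothetical, adapt them to your environment):

# postgresql.conf
listen_addresses = '*'
port = 5432

# pg_hba.conf : allow the EM agent host to connect with password authentication
host    all    postgres    192.168.56.0/24    md5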

Finally, the PostgreSQL database target has been added successfully:

pp7

Once the configuration phase is finished, let's have a look at how EM13c works with PostgreSQL!

The new PostgreSQL database target now exists in EM13c:

pp8

We have the same menus as we are used to for Oracle database targets:

pp18

Now let's try to create a monitoring template for the PostgreSQL database. From the EM13c Monitoring Templates menu, choose Create:

pp9

Choose Target Type and select PostgreSQL Database:

pp10

Enter the monitoring template name and a description:

pp11

You have the possibility to define thresholds for metrics:

pp12

Select OK and the DBI_DB_POSTGRES template is created.

Finally, we apply the template to our PostgreSQL database as follows: we select the PostgreSQL database and click Apply:

pp13

The EM13c home page of the PostgreSQL target now displays more information:

pp14

The PostgreSQL plugin offers a lot of monitoring possibilities:

pp161

 

We also have the possibility to display table or index details:

pp15

Another good point is the possibility to use reports, for example the target availability report:

pp17

You can also create incident rules in the same way you do for Oracle databases. Creating incident rules for Availability, Metric Alert or Metric Error events gives you the possibility to have incidents created and to receive alerts when something goes wrong in your PostgreSQL database.

pp16

Once you have correctly configured your incident rules and their events, when you stop your PostgreSQL database an incident is created and you receive an email :=)

pp19

Deploying the Blue Medora plugin in order to administer your PostgreSQL databases with Enterprise Manager 13c will help you administer a heterogeneous database environment. Furthermore, you can use monitoring templates and incident rules.

 

Cet article EM13c and postgres plugin est apparu en premier sur Blog dbi services.

Oracle on Windows Server Core


If SQL Server, the database for GUI fans, goes to Linux, then Oracle, the database for command line fans, can go to Windows. Ok, that's not new. But I'm not talking about the Windows with media player, animated icons, and The Microsoft Hearts Network. A real server: Windows Server Core. Actually, this is the first time I run something on Windows Core. Is it only command line? Is it hard to set up? Do we have to leave those amazing GUIs like netmgr? Let's try.

The installation starts as another Windows Server 2012R2 installation
VirtualBox_Windows 2012 Server Core_10_06_2016_14_00_02

The beauty of Windows. There is one button. You have a mouse. Just click on that central button.

VirtualBox_Windows 2012 Server Core_10_06_2016_14_00_14

I choose the 'Core' installation, hoping I won't need PowerShell to configure it.
VirtualBox_Windows 2012 Server Core_10_06_2016_14_00_57
I’m installing on brand new VirtualBox VM
VirtualBox_Windows 2012 Server Core_10_06_2016_14_02_36
I provision a 30GB disk but need only half of it.
VirtualBox_Windows 2012 Server Core_10_06_2016_14_02_42
Installation is going…
VirtualBox_Windows 2012 Server Core_10_06_2016_14_09_16
And I’m ready to log in.
VirtualBox_Windows 2012 Server Core_10_06_2016_14_17_07

This is the end of the GUI: I’ve only a command line window.

VirtualBox_Windows 2012 Server Core_10_06_2016_14_19_53

I have to set up the network. Easy, thanks to sconfig:

VirtualBox_Windows 2012 Server Core_10_06_2016_15_00_25

So… How will I install Oracle? On Linux, I can run runInstaller in silent mode. But here it's setup.exe:

VirtualBox_Windows 2012 Server Core_10_06_2016_15_02_07
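
For the record, setup.exe also supports a silent mode with a response file, so a GUI-less install would still be possible, something like this sketch (the response file path is hypothetical):

setup.exe -silent -waitforcompletion -responseFile C:\stage\database\response\db_install.rsp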

Great. Windows Server Core has no GUI for the system, but GUI applications can run!

VirtualBox_Windows 2012 Server Core_10_06_2016_15_29_51

I can install the software and then run DBUA:

VirtualBox_Windows 2012 Server Core_10_06_2016_15_59_27

Let’s start the listener

VirtualBox_Windows 2012 Server Core_10_06_2016_16_49_42

and connect to it (I had to disable the firewall with
netsh advfirewall set allprofiles state off)

VirtualBox_Windows 2012 Server Core_10_06_2016_16_59_99
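
Disabling the whole firewall is a bit brutal; opening only the listener port should be enough, something like this (assuming the default port 1521):

netsh advfirewall firewall add rule name="Oracle listener" dir=in action=allow protocol=TCP localport=1521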

Here it is. Easy to install and run Oracle Database on a Windows Server without all the overhead of those playful accessories. Yes, Microsoft Windows can be a real server. Is there any reason to run a database on something other than the Core installation?

 

Cet article Oracle on Windows Server Core est apparu en premier sur Blog dbi services.

Instance Caging and multitenant: do the right setting


When you want to do instance caging, you have to manually set CPU_COUNT and set a resource manager plan. If you set only CPU_COUNT, no instance caging will occur, except during the maintenance window where the maintenance plan is set internally. You don't want that kind of unpredictable behavior, so the recommendation is to always set a resource plan when you manually set CPU_COUNT. Here is another reason for such unpredictable behavior.

I’ve run 16 sessions running CPU. I’m in multitenant and they are connected to CDB$ROOT.

[oracle@CDB ~]$ jobs
[1] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[2] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[3] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[4] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[5] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[6] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[7] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[8] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[9] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[10] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[11] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[12] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[13] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[14] Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[15]- Running sqlplus / as sysdba <<< "exec loop null; end loop;" &
[16]+ Running sqlplus / as sysdba <<< "exec loop null; end loop;" &

I’ve set CPU_COUNT to 8:

SQL> show spparameter cpu_count
 
SID NAME TYPE VALUE
-------- ----------------------------- ----------- ----------------------------
* cpu_count integer 8

but no resource manager plan:

SQL> show parameter resource_manager_plan
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
resource_manager_plan string

However, instance caging occurs:

SQL> connect / as sysdba
Connected.
SQL> select count(*),con_id,session_state,event from v$active_session_history
2 where sample_time>sysdate-1/24/60/60 group by con_id,session_state,event;
 
COUNT(*) CON_ID SESSION EVENT
---------- ---------- ------- ------------------------------
20 1 WAITING resmgr:cpu quantum
12 1 ON CPU

Here you can see on the left that CPU usage has been limited to 8 user processes.
CaptureRESMGR005

Any idea why instance caging occurred when there is no visible resource manager plan? And what did I do at 08:57 PM to stop instance caging?
Well, I did:

SQL> alter pluggable database pdb close;

Now you understand. A resource plan was set for the PDB:

SQL> alter session set container=PDB;
 
Session altered.
 
SQL> show con_id
 
CON_ID
------------------------------
3
SQL>
SQL> show spparameter cpu_count
 
SID NAME TYPE VALUE
-------- ----------------------------- ----------- ----------------------------
* cpu_count integer
SQL>
SQL> show parameter resource_manager_plan
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
resource_manager_plan string DEFAULT_PLAN

So be careful if you set CPU_COUNT manually: any setting that activates Resource Manager will then enable instance caging. And setting a resource manager plan in a PDB activates Resource Manager for the instance as soon as the PDB is open (read/write or read only).

So the recommendation is: when you set CPU_COUNT in a CDB, always set a resource manager plan at CDB level. CDB_DEFAULT_PLAN is there for that:
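
For example, a minimal sketch at CDB$ROOT level (adapt CPU_COUNT to your server):

SQL> alter system set resource_manager_plan='CDB_DEFAULT_PLAN' scope=both;
SQL> alter system set cpu_count=8 scope=both;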

(by the way, more information about resource manager and 12c in Maris Elsins presentation)

If you are in Standard Edition, or Enterprise Edition without the multitenant option, you have only one PDB per CDB. This means that you probably have multiple instances on one server, and instance caging is of crucial importance there. Setting resource_manager_plan to CDB_DEFAULT_PLAN is sufficient to activate instance caging in a single-tenant instance:

CaptureRESMGR006

With multitenant option, you may create a custom plan with multiple directives. Instance caging is important even when you have only one instance on the server because the database scheduler is more efficient than the OS one. But that’s probably for a future blog post.

 

Cet article Instance Caging and multitenant: do the right setting est apparu en premier sur Blog dbi services.

Attunity Replicate: additional column on the target


In December 2015, my colleague Vincent did a test with an additional column with Oracle GoldenGate 12.2.
Hervé & Vincent asked me to do the same test with Attunity Replicate between Oracle and SQL Server.
For the record, Hervé did the same test with a previous version here.

I create a task and select the Scott schema.

Attunity_Replicate_Col01

I run the replication

Attunity_Replicate_Col02

On the target SQL Server database, I create an additional Column:

alter table [SCOTT].[EMP] add SOURCE_COL varchar(10) default null;

Attunity_Replicate_Col03

Now, I do the same on the source Oracle database:

alter table SCOTT.EMP add SOURCE_COL varchar(10) default null;

Attunity_Replicate_Col04

I check in the monitor what happened and see one error on the table SCOTT.EMP; in the log message, I see that the ALTER TABLE ADD SOURCE_COL did not work!

Attunity_Replicate_Col06

That's logical: the column already exists in SQL Server… 8-)

I update my table EMP on the source:

update SCOTT.EMP set source_col='change';
commit;

Attunity_Replicate_Col05

Attunity Replicate replicates the update command to the destination

Attunity_Replicate_Col07

I have a look at the destination and see that the update has been applied to the additional column.

Attunity_Replicate_Col08
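
A quick check on the SQL Server side confirms it (a sketch):

select EMPNO, ENAME, SOURCE_COL from [SCOTT].[EMP];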

As with GoldenGate 12.2, the additional column is updated by Attunity Replicate. No problemo! 8-)

 

Cet article Attunity Replicate: additional column on the target est apparu en premier sur Blog dbi services.

Large Pages on Windows


In a previous post I’ve installed Oracle on a Windows Server Core. Now I’ll enable large pages.

ORA_LPENABLE

To let Oracle use large pages on Windows, you need to define a registry string ORA_LPENABLE=1. This enables large pages for all instances, so it's probably better to do it at instance level with ORA_<SID>_LPENABLE, especially if you have an ASM instance on your server. Note that large pages are recommended for a database server, but if you have other applications that use a lot of memory you may run out of memory because of fragmentation.
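
Setting it from the command line could look like this (a sketch; the home key is the one shown by the reg query below, and the per-instance variant for my SID CDB would be ORA_CDB_LPENABLE):

rem instance-wide setting for this Oracle home
reg add HKLM\Software\Oracle\Key_OraDB12Home1 /v ORA_LPENABLE /t REG_SZ /d 1
rem or for one instance only (ORACLE_SID=CDB)
reg add HKLM\Software\Oracle\Key_OraDB12Home1 /v ORA_CDB_LPENABLE /t REG_SZ /d 1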

Stopping the service:


C:\cygwin64\bin>net stop OracleServiceCDB
The OracleServiceCDB service is stopping.
The OracleServiceCDB service was stopped successfully.

Showing the registry:


C:\cygwin64\bin>reg query HKLM\Software\Oracle\Key_OraDB12Home1
 
HKEY_LOCAL_MACHINE\Software\Oracle\Key_OraDB12Home1
ORACLE_HOME REG_SZ C:\app\oracle\product\12.1.0\EE12101
ORACLE_HOME_NAME REG_SZ OraDB12Home1
ORACLE_GROUP_NAME REG_SZ Oracle - OraDB12Home1
NLS_LANG REG_SZ AMERICAN_AMERICA.WE8MSWIN1252
ORACLE_BUNDLE_NAME REG_SZ Enterprise
OLEDB REG_SZ C:\app\oracle\product\12.1.0\EE12101\oledb\mesg
ORACLE_HOME_TYPE REG_SZ 1
ORACLE_SVCUSER REG_SZ oracle
ORACLE_SVCUSER_PWDREQ REG_SZ 1
ORACLE_BASE REG_SZ C:\app\oracle
MSHELP_TOOLS REG_SZ C:\app\oracle\product\12.1.0\EE12101\MSHELP
ORACLE_HOME_KEY REG_SZ SOFTWARE\ORACLE\KEY_OraDB12Home1
SQLPATH REG_SZ C:\app\oracle\product\12.1.0\EE12101\dbs
RDBMS_CONTROL REG_SZ C:\app\oracle\product\12.1.0\EE12101\DATABASE
RDBMS_ARCHIVE REG_SZ C:\app\oracle\product\12.1.0\EE12101\DATABASE\ARCHIVE
ORA_CDB_AUTOSTART REG_EXPAND_SZ TRUE
ORA_CDB_SHUTDOWN REG_EXPAND_SZ TRUE
ORA_CDB_SHUTDOWNTYPE REG_EXPAND_SZ immediate
ORA_CDB_SHUTDOWN_TIMEOUT REG_EXPAND_SZ 90
ORACLE_SID REG_SZ CDB
ORA_LPENABLE REG_SZ 1
 
HKEY_LOCAL_MACHINE\Software\Oracle\Key_OraDB12Home1\ODE
HKEY_LOCAL_MACHINE\Software\Oracle\Key_OraDB12Home1\OLEDB

Starting the service:

C:\cygwin64\bin>net start OracleServiceCDB
The OracleServiceCDB service is starting...
The OracleServiceCDB service was started successfully.

and tailing the alert.log:

C:\cygwin64\bin>adrci exec="set home cdb ; show alert -tail 5"
2016-06-13 12:51:22.709000 -07:00
Archiving is disabled
2016-06-13 12:51:32.153000 -07:00
Instance shutdown complete
2016-06-13 12:51:38.179000 -07:00
Starting ORACLE instance (normal) (OS id: 1620)
CLI notifier numLatches:3 maxDescs:519
Large page enabled. Mode is : 1
Large page size : 2097152
Large page request size : 16777216
2016-06-13 13:00:38.599000 -07:00
Starting ORACLE instance (normal) (OS id: 1760)
CLI notifier numLatches:3 maxDescs:519
Large page enabled. Mode is : 1
Large page size : 2097152
Large page request size : 16777216
2016-06-13 13:01:37.707000 -07:00
Starting ORACLE instance (normal) (OS id: 1888)
CLI notifier numLatches:3 maxDescs:519
Large page enabled. Mode is : 1
Large page size : 2097152
Large page request size : 16777216

You can see my registry entries on the right and the alert.log on the left.

CaptureWCLP001

But this is not sufficient to start the instance:

CaptureWCLP003

Lock pages in memory

In 12c you can, and you should, run the Oracle Database instance as a user other than the system administrator.
Large pages must be locked in physical memory (and that's one good reason to use them), but by default non-administrators do not have this privilege.

You have to allow it with the Local Group Policy Editor. Unfortunately, doing that on Windows Server Core seems to be very difficult because gpedit.msc is not there.
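
In theory, the 'Lock pages in memory' right (SeLockMemoryPrivilege) could be granted from the command line with secedit, an untested sketch:

secedit /export /cfg C:\temp\secpol.inf
rem edit C:\temp\secpol.inf and add the oracle service account to the SeLockMemoryPrivilege line
secedit /configure /db C:\temp\secpol.sdb /cfg C:\temp\secpol.inf /areas USER_RIGHTS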

At that point, my enthusiasm for Windows Server Core ended and I installed Windows Server with a GUI.

VirtualBox_Windows 2012 Server_14_06_2016_22_45_05

Now I can start up. Here is the alert.log:


Starting ORACLE instance (normal) (OS id: 2620)
Wed Jun 15 09:53:12 2016
CLI notifier numLatches:7 maxDescs:519
Wed Jun 15 09:53:12 2016
Large page enabled. Mode is : 1
Wed Jun 15 09:53:12 2016
Large page size : 2097152
Large page request size : 16777216
Wed Jun 15 09:53:12 2016
Allocated Large Pages memory of size : 14680064
Wed Jun 15 09:53:12 2016
Allocated Large Pages memory of size : 1660944384
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 4
Number of processor cores in the system is 4
Number of processor sockets in the system is 1
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =51
LICENSE_MAX_USERS = 0
SYS auditing is enabled
NOTE: remote asm mode is local (mode 0x1; from cluster type)
NOTE: Using default ASM root directory ASM
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options.
Windows NT Version V6.2
CPU : 4 - type 8664, 4 Physical Cores
Process Affinity : 0x0x0000000000000000
Memory (Avail/Total): Ph:5718M/8191M, Ph+PgF:7735M/10111M

And yes, I am in Automatic Memory Management (AMM), where the sizes of the PGA and SGA are dynamically resized by Oracle within MEMORY_TARGET:


Using parameter settings in server-side spfile C:\APP\ORACLE\PRODUCT\12.1.0\DBHOME_1\DATABASE\SPFILECDB.ORA
System parameters with non-default values:
processes = 300
use_large_pages = "ONLY"
memory_target = 1600M
control_files = "C:\APP\ORACLE\ORADATA\CDB\CONTROL01.CTL"
control_files = "C:\APP\ORACLE\FAST_RECOVERY_AREA\CDB\CONTROL02.CTL"
db_block_size = 8192
compatible = "12.1.0.2.0"
db_recovery_file_dest = "C:\app\oracle\fast_recovery_area"
db_recovery_file_dest_size= 500M
_catalog_foreign_restore = FALSE
undo_tablespace = "UNDOTBS1"
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=CDBXDB)"
audit_file_dest = "C:\APP\ORACLE\ADMIN\CDB\ADUMP"
audit_trail = "DB"
db_name = "CDB"
open_cursors = 300
diagnostic_dest = "C:\APP\ORACLE"
enable_pluggable_database= TRUE

On Windows, all Oracle “processes” run as threads of the same Windows process, so unlike on Linux, where AMM and HugePages are mutually exclusive, I don't see any reason to advise against AMM there.
Note that I don't see any reason to recommend it either…

 

Cet article Large Pages on Windows est apparu en premier sur Blog dbi services.

CPU_COUNT


When you have fewer CPU threads than the number of processes that have something to run on CPU, the OS will schedule them to share the CPU resource. Increasing the workload at that point will not increase the throughput because you have reached the capacity of your system, and response time will increase because of queuing. Actually, performance will even decrease because of the overhead of context switching when trying to share the processors.
When you don’t want the OS scheduler to do the resource sharing job, you can, and should, use Instance Caging. For sure, the database instance can do resource sharing more intelligently than the OS as it knows the kind of workload and the performance requirement of each process.

I did some tests on an 8-CPU machine running SLOB with 32 concurrent sessions, then 31, then 30, … down to the last run with 1 session, each for 5 minutes. This is what you see on the right-most dark green triangle here:
CaptureSLOBCPUCOUNT01
After a very short library cache contention while all 32 sessions parse their statements, each successive run goes decreasing. The dark green here is labelled as 'CPU + CPU wait' and comes from ASH, where all sessions are in the state 'ON CPU' even when they are actually in the OS run queue. Of course, I've only 8 CPU threads, so I cannot have 32 sessions running on CPU.

The runs on the left, where you can see the same but with some light green, are from the same runs but with instance caging active. I've set a resource manager plan and I've set CPU_COUNT to 8 (the first run on the left), then 7, … down to 1. The dark green is still the 'ON CPU' state, and with instance caging Oracle allows at most CPU_COUNT processes in that state. The remaining processes are switched to a waiting state, instrumented as 'resmgr: cpu quantum' and displayed in light green.

My goal is to show that you can increase the throughput with instance caging. I measured the logical reads per second and made an Excel chart from them. The blue lines are from different CPU_COUNT settings from 8 to 1. The orange line is from not setting CPU_COUNT, which means that instance caging is not enabled. On the X axis you have the number of concurrent SLOB sessions I've run. What you see from the blue lines is that the throughput increases linearly with the number of concurrent sessions until it reaches the limit: either the CPU_COUNT limit, or the physical limit when CPU_COUNT is not set. Note that the CPU threads are not cores here. Tests were done on Oracle Public Cloud 4 OCPUs (aka OC5 compute shape), which are actually 8 threads of E5-2690 v2 Intel processors. This is why running on two threads here does not double the throughput. Actually, when running 8 sessions on 8 threads, the throughput is only x6 compared to running one session on one thread.

CaptureSLOBCPUCOUNT

The second goal is to compare Oracle instance caging with the OS scheduler when the instance is using the full capacity of the server. On the top you can see the darker blue line, which is when CPU_COUNT is set to the actual number of CPU threads (CPU_COUNT=8). The orange line is when no CPU_COUNT is set: instance caging is disabled. The maximum throughput, 3.6 MLR/s, is reached when we run the same number of sessions as the number of CPU threads. What you see here is that when the server is overloaded, scheduling at instance level is more efficient than scheduling at OS level. Without instance caging (the orange line), the LR/s degrades because of context switching overhead. So the recommendation here is to always do instance caging, even if you have only one instance on your server.

Why is the instance caging algorithm better than the OS scheduler? Because it is focused on the database workload. Here are the graphs of the 'resmgr: cpu quantum' wait times.

CaptureSLOBCPUCOUNT02

On the left, I've run with CPU_COUNT=8. When I have 32 concurrent sessions, each of them spends 3/4 of its time waiting for CPU. Those waits are about 300 milliseconds. When I have only 9 sessions, each one has to spend only a small part of its response time waiting. They wait about 25 milliseconds on 'resmgr: cpu quantum'. The wait time is not fixed and depends on the load. This makes sense: when you know you will have to spend a long time waiting, it's better to have longer waits in order to avoid too many context switches. On the right, it's the same but with CPU_COUNT=1, which gives x8 less CPU time to the processes. They will have to spend more time waiting. And we see that the wait time is adjusted: it can go up to 4-second time slices. The OS scheduler will never do that, putting a process on the run queue for several seconds, because the scheduler tries to focus on response time. It's different with instance caging: when you know that you will have to spend a long time waiting, then it's better to optimize throughput by lowering the context switching.

The recommendation is to enable instance caging: set a resource manager plan and set CPU_COUNT. It's not an extra-cost option; there's no additional cost for it. And it will always be better than letting the OS manage CPU starvation.
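
A quick way to verify that the cage is in place (a sketch): check both parameters and look for throttled sessions.

SQL> show parameter cpu_count
SQL> show parameter resource_manager_plan
SQL> select event,count(*) from v$session where state='WAITING' and event like 'resmgr%' group by event;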

On Standard Edition 2, it’s even easier: Oracle Corp. enabled instance caging for you ;)

 

Cet article CPU_COUNT est apparu en premier sur Blog dbi services.


ASM iostats


A few screenshots and a link here. Sysadmins do not like ASM because they don't have the tools they like to manage the disks. For example, they don't want to run SQL queries to check performance, and asmcmd iostat is quite limited. Here is a nice way to get I/O statistics easily from the command line.

The perl script is from Bertrand Drouvot (do not miss his twitter profile picture) and is easily downloadable from his blog:
https://bdrouvot.wordpress.com/2013/10/04/asm-metrics-are-a-gold-mine-welcome-to-asm_metrics-pl-a-new-utility-to-extract-and-to-manipulate-them-in-real-time/

It only runs queries on the ASM instance, so there is no risk.
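
I assume it essentially samples views such as V$ASM_DISK_IOSTAT, which you can also query directly on the ASM instance, a rough sketch:

select dbname, group_number, sum(reads) reads, sum(writes) writes,
       sum(bytes_read) bytes_read, sum(bytes_written) bytes_written
from v$asm_disk_iostat
group by dbname, group_number;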

In order to show the relevance, I took screenshots from this script and from the XtremIO console on a system where all ASM disks, and only them, are on the XtremIO brick, so you can compare statistics from the storage array and from the ASM instance.

Bandwidth

ASMIOScreenshot 2016-06-20 15.18.55

IOPS

ASMIOScreenshot 2016-06-20 15.19.08

Latency

ASMIOScreenshot 2016-06-20 15.19.17

 

Cet article ASM iostats est apparu en premier sur Blog dbi services.

When changing CURSOR_SHARING takes effect?


I usually don't advise setting CURSOR_SHARING=FORCE, but imagine your application requires it, you forgot it (or tried to do without) on migration, and now everything is slow. You want to change it, but when does it take effect? New execution? New parse? New session?

EXACT

I have the default value, where the parent cursor is shared only when the sql_text is the same:

SQL> show parameter cursor_sharing
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cursor_sharing string EXACT

And I check with a query that the predicate is not changed:

SQL> select * from dual where dummy='X';
 
D
-
X
 
SQL> select * from table(dbms_xplan.display_cursor) where plan_table_output like '%filter%';
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
1 - filter("DUMMY"='X')

FORCE

I change at system (=instance) level

SQL> alter system set cursor_sharing=force;
System altered.
 
SQL> show parameter cursor_sharing
 
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cursor_sharing string FORCE

I tested without session cached cursors:

SQL> alter session set session_cached_cursors=0;
Session altered.

and even from another session

SQL> connect / as sysdba
Connected.

But the statement still has its literal predicate:

SQL> select * from dual where dummy='X';
 
D
-
X
 
SQL> select * from table(dbms_xplan.display_cursor) where plan_table_output like '%filter%';
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
1 - filter("DUMMY"='X')

No invalidation, no new cursor. Same old statement.

FLUSH SHARED_POOL

Only when I flush the shared pool do I get the statement executed with literals replaced:

SQL> alter system flush shared_pool;
System altered.
 
SQL> select * from dual where dummy='X';
 
D
-
X
 
SQL> select * from table(dbms_xplan.display_cursor) where plan_table_output like '%filter%';
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
1 - filter("DUMMY"=:SYS_B_0)

If you fear a hard parse fest, you can flush specific cursors. I’ve documented the procedure in a previous post.
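
For example, something like this sketch purges only the parent cursor of our statement (address and hash value come from V$SQLAREA):

declare
  l_name varchar2(100);
begin
  select address||','||hash_value into l_name
    from v$sqlarea
   where sql_text = q'[select * from dual where dummy='X']';
  dbms_shared_pool.purge(l_name,'C');
end;
/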

Autotrace

As a side note, do not rely on autotrace for that

SQL> set autotrace on explain
SQL> select * from dual where dummy='X';
 
D
-
X
 
Execution Plan
----------------------------------------------------------
Plan hash value: 272002086
 
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
1 - filter("DUMMY"='X')

Just one more thing that is special with autotrace…

Conclusion

I don't know exactly how cursor_sharing=force is managed. I thought that the literal replacement occurred before searching for the parent cursor. Don't hesitate to comment here if you know the 'why' behind this behavior. My goal here was just to test what has to be done for a cursor_sharing change to take effect immediately.

 

Cet article When changing CURSOR_SHARING takes effect? est apparu en premier sur Blog dbi services.

ORA-01775: looping chain of synonyms


This error message is misleading. You may encounter it when you expect ORA-00942: table or view does not exist. Let's explain.

I'm connected as SCOTT and create a PUBLIC SYNONYM for an object that does not exist:

SQL> create public synonym MONEY for NOTHING;
Synonym created.

No error message.
Only when I read it do I get an error message telling me that there is no table or view behind it:

SQL> select * from NOTHING;
select * from NOTHING
*
ERROR at line 1:
ORA-00942: table or view does not exist

Let’s do the same but call it BONUS instead of MONEY:

SQL> create public synonym BONUS for NOTHING;
Synonym created.
 
SQL> select * from BONUS;
no rows selected

No error here. Why? Because I have a table called BONUS. So the name is resolved to the table and the synonym is not even tried.

I’ll now drop that synonym and create it for the table BONUS. Same name for the public synonym and for the table.

SQL> drop public synonym BONUS;
Synonym dropped.
 
SQL> create public synonym BONUS for BONUS;
Synonym created.

As user SCOTT, when I query BONUS the name is resolved as the table:

SQL> show user
USER is "SCOTT"
SQL> select * from BONUS;
no rows selected

As another user, when I query BONUS the name is resolved as the synonym, which finally reads SCOTT.BONUS:

SQL> show user
USER is "SCOTT"
SQL> select * from BONUS;
no rows selected

In 12c it is easy to see the final query:

SQL> variable c clob
SQL> exec dbms_utility.expand_sql_text('select * from BONUS',:c);
PL/SQL procedure successfully completed.
 
SQL> print c
 
C
----------------------------------------------------------------------------------------------------------
SELECT "A1"."ENAME" "ENAME","A1"."JOB" "JOB","A1"."SAL" "SAL","A1"."COMM" "COMM" FROM "SCOTT"."BONUS" "A1"

But now, what happens when we drop the table?

SQL> drop table SCOTT.BONUS;
Table dropped.

Do you expect a ORA-00942: table or view does not exist?

SQL> select * from BONUS;
select * from BONUS
*
ERROR at line 1:
ORA-01775: looping chain of synonyms

Here is the 'looping chain of synonyms'. I ask for BONUS. The name resolution first checks for an object in my schema, but there is none:

SQL> select object_type from user_objects where object_name='BONUS';
no rows selected

Then it looks for a public synonym and there is one:

SQL> select object_type from all_objects where owner='PUBLIC' and object_name='BONUS';
 
OBJECT_TYPE
-----------------------
SYNONYM

So we check what it is a synonym for:

SQL> select table_owner,table_name from all_synonyms where owner='PUBLIC' and synonym_name='BONUS';
 
TABLE_OWNER TABLE_NAME
------------ ------------
SCOTT BONUS

And this is where it gets interesting. Despite the column names that include 'TABLE', a synonym can reference any object. So it's not just replacing the synonym with 'SCOTT.BONUS', which would raise ORA-00942. It is doing name resolution of BONUS in the context of the user SCOTT. Something similar to:

SQL> alter session set current_schema=SCOTT;
Session altered.
SQL> select * from BONUS;

And then, what do you expect from that? There is no table named BONUS but there is a public synonym… and you're back to the beginning:

select * from BONUS
*
ERROR at line 1:
ORA-01775: looping chain of synonyms

Most of the time, you don't have synonyms that reference other synonyms, so you don't really have a 'chain' of synonyms. Except when there is only a synonym in the namespace and it's a self-referencing loop. So if you see ORA-01775, check whether the referenced object is missing.
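
A quick way to find such dangling public synonyms (a sketch that ignores database links):

select s.synonym_name, s.table_owner, s.table_name
  from dba_synonyms s
 where s.owner='PUBLIC'
   and s.db_link is null
   and not exists (select 1 from dba_objects o
                   where o.owner=s.table_owner and o.object_name=s.table_name);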

 

Cet article ORA-01775: looping chain of synonyms est apparu en premier sur Blog dbi services.

ODA X6-2S and ODA X6-2M for EE and SE2


After the announcement of the death of SE1 and SE, we wondered what Oracle would do for small and medium enterprises and entry-level products. The answer was postgres… sorry, the Oracle Cloud Services, but that's for dev/test only because the Public Cloud is not for production and SMEs will not build a Private Cloud for their few Oracle databases.

The answer was announced to partners a few weeks ago and is now official: the 5th generation of ODA, the X6, now has an entry-level version for Standard Edition 2.

ODA, until today, was only for Enterprise Edition. You could install Standard Edition on a Guest VM (described here), but with poor I/O performance and none of the ODA advantages.

The smallest ODA was a 2-node cluster. The hardware is not expensive for what it is, but it is still expensive when you want only one small server.
Lots of our customers asked for a small ODA with Standard Edition. Even big customers with only a small database: they consolidate everything on VMware, but for licensing reasons need to isolate Oracle on a physical machine. Two ODAs (as they want a DR site) are too large for their needs.

Good news: there are now smaller, one-node ODAs that can run Standard Edition and are fully automated, even more automated than the previous ODAs.
New products but same values:

  • simplicity with automated deployment,
  • automated patching,
  • zero-admin storage,
  • integrated VM management,
  • performance (new NVMe to access flash storage)

The new products are:

ODA X6-2S: the entry level for SE2

'Only one single database', but this is not a limit, just a recommendation related to the capacity.
Virtualized (can run application server on it)
1 socket with a 10-core Xeon, 128GB RAM (up to 384GB), 10GBase-T public network and/or 10GbE SFP+ public network
All-flash storage: 6.4 TB NVMe flash, expandable to 12.8 TB; usable for database: 2.4 TB up to 4.8 TB
Cost: $18,000, which is about the same as an equivalent Dell server (Oracle says it's 2x cheaper, but we may not have the same price list)

ODA X6-2M: similar but ‘multiple databases’

More resources (2x sockets, 2x memory)
Same storage as the X6-2S, but NFS is also supported for additional storage
Cost: $24,000

Both can be used for Standard Edition 2 (licensed per socket or NUP) or Enterprise Edition (licensed with Capacity on Demand activated cores or NUP).

ODA HA

This is the X5-2 we already know, with 2 servers (the X6 version is planned for the end of the year).
Two nodes there, for RAC, RAC One Node, or single-instance consolidation. It can be Bare Metal or Virtualized.

Just my guess, but there is a good chance that only the multitenant architecture will be supported for them in 12.2 – single-tenant when in SE2 of course – so that it is easy to move databases to and from the public cloud services. Actually, the goal is:

You run dev and test on Oracle Public Cloud and your production on-premises. That’s a full Oracle solution where you can move your (pluggable) databases between the two platforms.

The ODA automation has evolved. It used to be easy configuration and patching; it is now also easy provisioning, with the same interface that you can find in Enterprise Manager or the Cloud. The Appliance Manager is accessible through a web console or the command line and helps automate deployment, management, support and monitoring of the ODA.

 

Cet article ODA X6-2S and ODA X6-2M for EE and SE2 est apparu en premier sur Blog dbi services.

Question is: upgrade now to 12.1.0.2 or wait for 12.2 ?


Let’s look at Release Schedule of Current Database Releases (Doc ID 742060.1)
12.2.0.1 is planned for 2HCY2016 on platforms Linux x86-64, Oracle Solaris SPARC (64-bit), Oracle Solaris x86-64 (64-bit).
2HCY2016 starts next week, but we can imagine that it will not be released immediately, and anyway we will have to wait a few months to download the on-premise(s) version. Add another couple of months to get at least one Proactive Bundle Patch to stabilize that new release. So maybe we can plan a production upgrade for Jan. 2017 on the Linux platform, and Apr. or Jul. 2017 on the Windows platform, right? How does that fit with the 11.2.0.4 and 12.1.0.1 end of support?

Is the delay for 12.2 a problem?

My opinion is that a long time between new releases is not a problem. Most customers want a stable, supported release, not new features that are available only with options and that may introduce bugs. As long as we have support, PSUs and Proactive Bundle Patches, everything is ok. We can't complain about software regressions after an upgrade and, at the same time, ask for new releases in a short period of time.
So in my opinion, waiting 6 months or 1 year to get 12.2 is not a problem except for book authors that wait for the general availability of 12.2 to release their book https://www.amazon.com/Oracle-Database-Release-Multitenant-Press/dp/1259836096 ;)

Is ‘cloud first’ a problem?

I don't think that 'cloud first' is a problem by itself. We will have to learn 12.2 features and test them before upgrading our databases, and the Oracle Public Cloud is good for that. But I fear that customers will feel forced to go to the cloud, which is wrong. It was the same when 12.1.0.2 was released for Enterprise Edition only: customers felt forced to quit Standard Edition, but that was probably not the goal. Especially when some of those that quit Standard Edition One did it to go to an open-source RDBMS.

Is ‘multitenant first’ a problem?

Yes, 'cloud first' may mean 'multitenant first' because that's the only architecture available for 12c on the Oracle DBaaS. Well, you can still install a non-CDB if you choose a 'virtual image'. And anyway, the OPC trial is a good occasion to test 12.2 and multitenant at the same time. Let me repeat that the multitenant architecture has a lot of features available without the multitenant option.

Upgrade planning

Back down to earth: the problem, in my opinion, is the uncertainty.
Free extended support for 11.2.0.4 ends on 31-May-2017 and we don't know yet if we will have a stable (i.e. with a few PSUs) 12.2 release at that time for on-premises, especially for Windows, which will come later than Linux.
Remember that 12.1.0.2 on Windows came two months after the Linux one. And another two months for AIX.

12.1.0.1 support ends on 31-Aug-2016 and 12.2 will not be out at that time, at least for on-premises.

So what?

Customers that expected to get 12.2 before the end of 12.1.0.1 or 11.2.0.4 support will now (since last month's announcement of 2HCY2016 and the recent 'cloud first' announcement) have to plan an intermediate upgrade to 12.1.0.2 before going to 12.2. And because of the 'Release 1' myth, they are afraid of that. Our mission, as consultants and Oracle partners, is to explain that the myth has no basis. Look at Mike Dietrich's blog about that. I hope you will be convinced that versions, releases and patchsets can all bring regressions and should be carefully tested, whether it's the 1st, 2nd or 4th number of the version identification that is incremented. A new ORACLE_HOME is new software.

Then, once on 12.1.0.2, you will have time to plan an upgrade to 12.2 after learning, testing, and adapting administration scripts/procedures/habits to the era of multitenant. And you will be ready for the future.

Customers on 11.2.0.4 that do not want to plan that intermediate upgrade will have the option to pay for extended support, which ends on 31-Dec-2019.

 

Cet article Question is: upgrade now to 12.1.0.2 or wait for 12.2 ? est apparu en premier sur Blog dbi services.
