Thursday, September 22, 2016

ODI Decrypt password

The Java code below can be used to decrypt passwords stored in Oracle Data Integrator.



 import com.sunopsis.dwg.DwgObject;

 public class OdiDecrypt {

   @SuppressWarnings("deprecation")
   public static void main(String[] args) {
     // Encrypted password string as stored in the ODI repository
     String strMasterPassEnc = "gHyXQ5WaJua6RFCRmP1l";
     // snpsDecypher is deprecated but still shipped with odi-core.jar
     String strMasterPass = DwgObject.snpsDecypher(strMasterPassEnc);
     System.out.println(strMasterPass);
   }
 }

Required .jars:

apache-commons-lang.jar
odi-core.jar, which is available at $ORACLE_HOME\oracledi.common\odi\lib\
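
A minimal sketch of compiling and running the class, assuming both jars have been copied next to OdiDecrypt.java (Windows classpath separator shown; use : instead of ; on Unix):

javac -cp "odi-core.jar;apache-commons-lang.jar" OdiDecrypt.java
java -cp ".;odi-core.jar;apache-commons-lang.jar" OdiDecrypt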


Thursday, September 15, 2016

ODI 12C Studio installation

Oracle Data Integrator Studio is the developer's interface for configuring and managing ODI. To install it, download the latest Disk 1 and Disk 2 files from OTN and extract them to a local folder.



Make sure you have Java 8 installed on your system. If you have an older version of Java and would like to update to a newer one, use the commands below from the command prompt.

Click Start and type cmd. When the cmd.exe icon appears, right-click it and select Run as administrator. To add or update the system environment variables permanently:

setx -m JAVA_HOME "C:\Program Files\Java\jdk1.8.0"
setx -m PATH "%PATH%;%JAVA_HOME%\bin"
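
Afterwards, open a new command prompt and verify which Java version is on the path:

java -version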

Open the command prompt as administrator and launch the installer jar as shown below.
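
For example, if the extracted installer jar is named fmw_12.2.1.0.0_odi.jar (the exact file name depends on the release you downloaded):

java -jar fmw_12.2.1.0.0_odi.jar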

Select Standalone Installation

Click Finish to complete installation.

Open the studio from the windows start menu.

Connecting to the Master Repository


If you have installed any previous version of Oracle Data Integrator on the same computer you are currently using, you may be asked whether or not you want to import preferences and settings from those previous installations into ODI Studio. The tasks and descriptions in this section assume that no previous versions of Oracle Data Integrator exist on your computer.

To connect to the master repository:
  1. From the ODI Studio menu, select File, then select New.
    On the New Gallery screen, select Create a New ODI Repository Login, then click OK.
  2. On the Oracle Data Integrator Login screen, click the plus sign (+) icon to create a new login. On the Repository Connection Information screen:
    • Oracle Data Integrator Connection section:
      • Login Name: Specify a custom login name.
      • User: Specify SUPERVISOR (all CAPS).
      • Password: Specify the supervisor password specified on the Custom Variables screen in RCU.
    • Database Connection (Master Repository) section:
      • User: Specify the schema user name for the Master repository. This should be prefix_ODI_REPO as specified on the Select Components screen in RCU.
      • Password: Specify the schema password for the Master repository. This was specified on the Schema Passwords screen in RCU.
      • Driver List: Select the appropriate driver for your database from the drop-down list.
      • URL: Specify the connection URL. Click the magnifying glass icon for more information about the connection details for your driver. A sample URL is shown after these steps.
    • In the Work Repository section, select Master Repository Only.

  3. Click Test to test the connection, and fix any errors. After the test is successful, click OK to create the connection.
  4. Specify and confirm a new wallet password on the New Wallet Password screen.
  5. After you have successfully created a new login, you are returned to ODI Studio.
    Select Connect to Repository and, when prompted, provide your new wallet password.
    After providing your wallet password, the Oracle Data Integrator Login screen appears. Provide the following information to log in:
    1. In the drop-down menu in the Login Name field, select the name of the new login you just created.
    2. Specify SUPERVISOR as the user name.
    3. Provide the password for the Supervisor user.
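
For reference, a typical connection URL for the Oracle thin driver looks like the following; the host, port, and service name are placeholders for your own repository database:

jdbc:oracle:thin:@//dbhost:1521/orcl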

Sunday, August 7, 2016

MapReduce FileAlreadyExistsException - Output file already exists in HDFS

The exception below is raised because your output directory already exists in the HDFS file system.


 Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/C:/HadoopWS/outfile already exists  
     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)  
     at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)  
     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)  


You have to delete the output directory after running the job once, before it can be run again. This can be done on the command line:

$ hdfs dfs -rm -r /pathToDirectory

If you would like to do this through Java code, the snippet below can be used. It deletes the output folder before each run of the job.

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Recursively delete the output directory if it already exists
Path output = new Path(outPath);
FileSystem hdfs = FileSystem.get(conf);
if (hdfs.exists(output)) {
    hdfs.delete(output, true);
}



Another workaround is to pass the output directory as a command-line argument, as below.

$ yarn jar {name_of_the_jar_file.jar} {package_name_of_jar} {hdfs_file_path_on_which_you_want_to_perform_map_reduce} {output_directory_path}
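
For example, with a hypothetical jar, driver class, and paths, a run could look like:

$ yarn jar wordcount.jar com.example.WordCount /user/hduser/input /user/hduser/output_run2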

If you would like to create a new output directory every time, the code below can be used.


 String timeStamp = new SimpleDateFormat("yyyy.MM.dd.HH.mm.ss", Locale.US).format(new Timestamp(System.currentTimeMillis()));
 // Each run gets a fresh timestamped output directory under /MyDir
 FileOutputFormat.setOutputPath(job, new Path("/MyDir/" + timeStamp));

Tuesday, August 2, 2016

Hadoop - Find The Largest Top 10 Directories in HDFS

Sometimes it is necessary to know which files or directories are eating up your disk space. The script below combines standard Unix commands with hadoop fs to find the ten largest directories on HDFS.


 echo -e "calculating the size to determine top 10 directories on HDFS......"  
 for dir in `hadoop fs -ls /|awk '{print $8}'`;do hadoop fs -du $dir/* 2>/dev/null;done|sort -nk1|tail -10 > /tmp/size.txt  
 echo "| ---------------------------     | -------    | ------------ | ---------   | ----------   ------ |" > /tmp/tmp  
 echo "| Dir_on_HDFS | Size_in_MB | User | Group | Last_modified Time |" >> /tmp/tmp  
 echo "| ---------------------------     | -------    | ------------ | ---------   | ----------   ------ |" >> /tmp/tmp  
 while read line;  
 do  
     size=`echo $line|cut -d' ' -f1`  
     size_mb=$(( $size/1048576 ))  
     path=`echo $line|cut -d' ' -f2`  #(Use -f3 if running on cloudera)  
     dirname=`echo $path|rev|cut -d'/' -f1|rev`  
     parent_dir=`echo $path|rev|cut -d'/' -f2-|rev`  
     fs_out=`hadoop fs -ls $parent_dir|grep -w $dirname`  
     user=`echo $fs_out|grep $dirname|awk '{print $3}'`  
     group=`echo $fs_out|grep $dirname|awk '{print $4}'`  
     last_mod=`echo $fs_out|grep $dirname|awk '{print $6,$7}'`  
     echo "| $path | $size_mb | $user | $group | $last_mod |" >> /tmp/tmp  
 done < /tmp/size.txt  
 cat /tmp/tmp | column -t
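
To run it, save the script to a file (hdfs_top10.sh here is just an example name), make it executable, and execute it on a node where the hadoop client is configured:

$ chmod +x hdfs_top10.sh
$ ./hdfs_top10.sh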