Outer Joins and Effdt June 26, 2007
Posted by Duncan in Oracle, PeopleSoft, PeopleTools, SQL.trackback
I was asked by one of our Junior Consultants the best method of outer joining to a table containing Effdt (which you’d want to do a Max Effdt sub-query on).
I provided a solution, but after pondering the issue further I realised there are a number of ways to do this.
For those not familiar with the problem, an Outer Join is a specific SQL statement where all rows from the first table are returned even if no corresponding row in the second table is found. Values from the second table are shown where they exist, defaulting to null when they do not. The issue arises as the second table has an effdt column, which usually necessitates a sub query (to find the row with the greatest effdt less than a supplied as of date). This would be a sub-query, and you can’t outer-join to a subquery.
So, which is the best solution?
NVL both sides of the Subquery
The answer I gave was to apply a NVL to both sides of the sub-query thus:
and NVL(a.effdt,'01/JAN/1901') = (select NVL(max(a1.effdt),'01/JAN/1901') from ....
This worked fine in this example, but I’m wondering if there is a more elegant solution.
Normal Effdt plus Effdt is Null
An alternative would be to have the Effdt sub-query as normal (which would return any rows having an Effdt matches the sub-query Effdt) and then adding
or Effdt is null
This is a solution I’ve seen used the most, but to my mind it would omit data which has a row in the second table – but where the Effdt is after the AsOfDate. Therefore it wouldn’t be returned.
Inline View
Possibly the solution involving the most lines of code is creating an inline view. The Inline view would contain the second table and the subquery, in effect ‘flattening’ them. The first table can have an outer join to this Inline View without having to worry about the subquery.
It would look something like this:
SELECT PD.EMPLID
, PD.NAME
, J.LOCATION
FROM PS_PERSONAL_DATA PD
, (SELECT *
FROM PS_JOB J
WHERE J.EFFDT = (SELECT MAX(J1.EFFDT)
FROM PS_JOB J1
WHERE J1.EMPLID = J.EMPLID
AND J1.EMPL_RCD = J.EMPL_RCD
AND J1.EFFDT <= SYSDATE)) J
WHERE PD.EMPLID = J.EMPLID (+)
Does anyone have a better suggestion or comment on the above methods?
Comments
Sorry comments are closed for this entry
In your OR scenario above, you said, “…it would omit data which has a row in the second table – but where the Effdt is after the AsOfDate.” Isn’t that the point? Actually, it isn”t the NULL that will cause this, it is the supplied AsOfDate (SYSDATE in your example above).
Try using IN rather than OR: A.EFFDT IN (NULL, (SELECT MAX(A_EFF.EFFDT) FROM…)). Unfortunately, I don’t have a PeopleSoft database in front of me to test this. Like you, I have tried several alternatives. The problem with the OR is that it causes Oracle to run 2 queries, one for each case in the OR.
It is good to have multiple alternatives. The alternative that is best for one query may not be the best for another query.
Reviewing the post, I perhaps could have explained what I was trying to do a little better. In 99% of cases if you have an Effdt clause and the data only has a future dated row then you would want the data to be excluded. This situation was a little different however.
I had a record (in the example it was Personal Data) which had some employee data, and I’d outer joined to Job data. If the only row in job was a future dated row, I still wanted to retrieve the Personal Data row and have nulls for the Job data.
From my testing I discovered that the first two methods omit the row altogether – as though it doesn’t exist in Personal Data either – which is clearly not what we want. Only the Inline View returns the ‘correct’ result set.
I can’t seem to get the results I was getting when I wrote the post on my SQL Server VM. I’ll try it on as Oracle environment tomorrow and report back.
Ok, I’ve had a look and it seems that using
‘J.EFFDT in(null, (SELECT MAX(J1.EFFDT) …’
gives the same results as all of the other queries apart from the inline view, i.e. if there is only a future dated row in the outer joined to table it doesn’t show the row from the first table either.
So – for this specific set of circumstances – it seems only the inline view method gives the correct results.
Can anyone improve on this?
I would have thought the following would do it:
select PD.EMPLID, PD.NAME, J.LOCATION
from PS_PERSONAL_DATA PD
left join PS_JOB J on J.EMPLID = PD.EMPLID
and J.EFFDT = (select max(J1.EFFDT from PS_JOB J1 where…)
You’re right, it does.
I’d avoided trying that as I know you have to be careful putting ‘non-join’ conditions in the from clause. Have a look at:
http://www.oreillynet.com/pub/a/network/2002/10/01/whatsinacondition.html
I had a requirement where the supervisor id and supervisor name should be shown in the PS Query, the issue was the employees who doesnt have supervisor id are not shown in the result. I tried to do outer join using all the methods mentioned above. Nothing worked. At the end I had to do the UNION of the employees who doesnt have supervisor id (with a dummy column for a supervisor name) and the previous result….and at the end I achieved the outer join. But it takes more time…
Outer Joins Not Working…
If you are having problem getting outer joins to work, it may be due to nested effective date logic. Here is a link to blog post that helped me.
…