
NOTE: This is just a proposal. There is no guarantee that any of this will ever end up in the actual codebase; I just felt it was worth experimenting. - JR

Update (09-05-2007): I've made this work successfully for Collections, Items, and Bundles. The performance improvements aren't fully implemented, but the separation is there, and in theory, that was the hard part. The ArchiveManager seems to be working pretty well too. - JR

Update (11-05-2007): I've implemented DAOs for the Community class as well. The ArchiveManager now supports moving Items, Collections, and Communities between containers. - JR

Update (23-05-2007): I've totally reimplemented persistent identifiers in DSpace as well (see PersistentIdentifiers). As well as removing the Handle System dependency, they also use DAOs. - JR

Update (20-06-2007): After a bit of a hiatus while I worked on PersistentIdentifiers, I've come back to DAOs, and I've now (mostly) got them in place for Bitstreams as well. The two major classes that still need doing are EPerson and Group; once they're done, there are a few others (eg: SupervisedItem, WorkspaceItem, etc) but they should be relatively simple. - JR

Just adding a comment that Handle/Pid management could be greatly improved by such an addition as well. Currently, with item caching, the value returned by the DSpaceObject.getHandle method can become stale, and using DAOs behind the scenes for the HandleManagement might be beneficial. -- Mark Diggory 13:56, 10 May 2007 (EDT)

Update (14-08-2007): Everything (apart from the code in org.dspace.checker) has been pushed through the DAO layer. Non-DAO classes no longer import the DatabaseManager or throw SQLExceptions. There are interfaces for CRUD and link operations in org.dspace.storage.dao, and I intend to write some tests against them that can be thrown at all the implementing DAOs. - JR

It has often struck me that DSpace would benefit from the use of Data Access Objects (DAOs). If nothing else, it would make porting to alternative database platforms far easier; all we would need to do is provide alternative implementations for the DAO interfaces that worked for a given database. To this end, I have broken up some of the core classes in org.dspace.content to use DAOs. As part of the same effort, I have done some work on making the Context less data-layer dependent (by having it hold an org.dspace.storage.dao.GlobalDAO rather than a java.sql.Connection, etc). I've also introduced a proxy for the Item that is a bit smarter about when it retrieves content from the data layer, and an ArchiveManager class that takes care of some core "archive operations" (so that other core classes don't need to).

The integration process (if it happens at all) would go as follows:

  • Incorporate the new DAO classes into the codebase
  • Refactor org.dspace.content.Item (etc) to use the DAO implementations of the data access methods internally
  • Mark relevant methods in org.dspace.content.Item as @Deprecated
  • Using the compile-time deprecation warnings as a guide, refactor the rest of the code to use the DAOs explicitly rather than hiding the functionality behind existing methods
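
The second and third steps can be sketched as follows. This is only an illustration under stated assumptions: the Item and ItemDAO classes here are simplified, hypothetical stand-ins (an in-memory map instead of a database, a single shared DAO instead of a factory, no Context argument), not the real DSpace classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a DAO; the real one would talk to a database.
class ItemDAO
{
    private final Map<Integer, Item> store = new HashMap<>();

    Item retrieve(int id) { return store.get(id); }

    void update(Item item) { store.put(item.getID(), item); }
}

// Simplified stand-in for org.dspace.content.Item.
class Item
{
    // A single shared DAO keeps the sketch short; the proposal uses a factory.
    static final ItemDAO DAO = new ItemDAO();

    private final int id;

    Item(int id) { this.id = id; }

    int getID() { return id; }

    /** @deprecated callers should use ItemDAO.retrieve(int) directly. */
    @Deprecated
    static Item find(int id) { return DAO.retrieve(id); }

    /** @deprecated callers should use ItemDAO.update(Item) directly. */
    @Deprecated
    void update() { DAO.update(this); }
}
```

Existing callers keep working through the old methods, while the compiler flags each call site that still needs to be moved over to the DAO.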

Without further ado, here is how I have refactored org.dspace.content.Item to use DAOs. A few important things to note:

  • "old" code has been used where possible to avoid re-implementing the wheel
  • I've never liked org.dspace.content.ItemIterator so I've switched to using a "real" Iterator from a List<Item>

For examples of both of these principles, see the implementation of getItems() below. It is a fairly straightforward wrapper for the current Item.findAll(), except that it returns a List<Item> rather than an ItemIterator.



org.dspace.content

The Item class will be broken up into the following classes:

  • org.dspace.content.Item: core class that doesn't go near the database (it doesn't even know about the DAOs); behaves much like the current implementation.
  • org.dspace.content.dao.ItemDAO: interface defining DAO API
  • org.dspace.content.dao.ItemDAOFactory: factory for dishing out implementations of the above interface
  • org.dspace.content.dao.postgres.ItemDAOPostgres: default implementation of the above interface for use with PostgreSQL
  • org.dspace.content.proxy.ItemProxy: subclass of Item that needs to know about the DAO. It will be used for (eg) only loading metadata on demand, to reduce the memory footprint of Items etc.

The following classes have also been introduced:

  • org.dspace.core.ArchiveManager
  • org.dspace.storage.dao.GlobalDAO
  • org.dspace.storage.dao.GlobalDAOFactory
  • org.dspace.storage.dao.GlobalDAOPostgres

Note that it might be preferable to have a more generic implementation of the ItemDAO interface that supports both PostgreSQL and Oracle, but given that one motivation for adopting DAOs is to remove db-specificities from the code making it easier to port, I thought it was sensible to start with just PostgreSQL. Eventually, it ought to be possible to drop in ItemDAOHibernate (etc) implementations that make db portability far easier.

org.dspace.content.Item

Basic implementation of the Item object. This class has been stripped down to remove all contact with the database, including (but not limited to) constructors, factory methods, update(), delete(), find(), etc. I haven't decided exactly how the Item API will look, but it will probably be much the same as before, only without any of the aforementioned methods. Another key difference is that it will have actual Java objects as member variables instead of pulling everything out of a TableRow.

org.dspace.content.proxy.ItemProxy

This will be a fairly simple proxy implementation. Specifically, it will be closest to being a virtual proxy, in that it will appear to be a regular Item object, but will have a slightly smarter implementation (not loading metadata until requested, keeping track of what has changed to make updates more efficient etc).

Code Block

public class ItemProxy extends Item
{
    // Overrides relevant methods of Item.
}
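
To make the "not loading metadata until requested" idea concrete, here is a minimal sketch of a virtual proxy. Everything in it is an assumption made for illustration: the Item stand-in stores metadata as plain strings, and a Supplier takes the place of the DAO call that the real ItemProxy would make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for org.dspace.content.Item: in-memory state only.
class Item
{
    private final int id;
    protected List<String> metadata = new ArrayList<>();

    Item(int id) { this.id = id; }

    public int getID() { return id; }

    public List<String> getMetadata() { return metadata; }
}

// Virtual proxy: looks like an Item, but defers the metadata load until asked.
class ItemProxy extends Item
{
    private final Supplier<List<String>> loader; // stands in for a DAO call
    private boolean loaded = false;

    ItemProxy(int id, Supplier<List<String>> loader)
    {
        super(id);
        this.loader = loader;
    }

    @Override
    public List<String> getMetadata()
    {
        if (!loaded)
        {
            // First access: fetch from the data layer and remember the result.
            metadata = new ArrayList<>(loader.get());
            loaded = true;
        }
        return metadata;
    }

    public boolean isLoaded() { return loaded; }
}
```

Code that only needs the item's ID never triggers the load, and repeated calls to getMetadata() hit the data layer at most once.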

org.dspace.content.dao.ItemDAO

This isn't final, but it's a good start.

Code Block

public interface ItemDAO extends ContentDAO,
    CRUD<Item>, Link<Item, Bundle>
{
    public Item create() throws AuthorizeException;
    public Item retrieve(int id);
    public Item retrieve(UUID uuid);
    public void update(Item item) throws AuthorizeException;
    public void delete(int id) throws AuthorizeException;
    public List<Item> getItems();
    public List<Item> getItemsBySubmitter(EPerson eperson);
    public List<Item> getItemsByCollection(Collection collection);
    public List<Item> getParentItems(Bundle bundle);
}
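
The generic CRUD and Link interfaces are not shown on this page, so the following is only a guess at their shape, with a toy in-memory DAO to exercise them. The method names (link, unlink, linked) and the Item and Bundle stand-ins are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of the generic interfaces; real signatures may differ.
interface CRUD<T>
{
    T create();
    T retrieve(int id);
    void update(T t);
    void delete(int id);
}

interface Link<T, U>
{
    void link(T parent, U child);
    void unlink(T parent, U child);
    boolean linked(T parent, U child);
}

// Toy domain objects, not the real DSpace classes.
class Item
{
    final int id;
    Item(int id) { this.id = id; }
}

class Bundle
{
    final String name;
    Bundle(String name) { this.name = name; }
}

// In-memory DAO used only to show how one class can satisfy both contracts.
class InMemoryItemDAO implements CRUD<Item>, Link<Item, Bundle>
{
    private final Map<Integer, Item> items = new HashMap<>();
    private final Map<Integer, List<Bundle>> bundles = new HashMap<>();
    private int nextId = 1;

    public Item create()
    {
        Item item = new Item(nextId++);
        items.put(item.id, item);
        return item;
    }

    public Item retrieve(int id) { return items.get(id); }

    public void update(Item item) { items.put(item.id, item); }

    public void delete(int id)
    {
        items.remove(id);
        bundles.remove(id);
    }

    public void link(Item item, Bundle bundle)
    {
        if (!linked(item, bundle))
        {
            bundles.computeIfAbsent(item.id, k -> new ArrayList<>()).add(bundle);
        }
    }

    public void unlink(Item item, Bundle bundle)
    {
        List<Bundle> linkedBundles = bundles.get(item.id);
        if (linkedBundles != null)
        {
            linkedBundles.remove(bundle);
        }
    }

    public boolean linked(Item item, Bundle bundle)
    {
        List<Bundle> linkedBundles = bundles.get(item.id);
        return linkedBundles != null && linkedBundles.contains(bundle);
    }
}
```

Because every DAO would implement the same two small interfaces, a single set of generic tests could be run against all of them.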

org.dspace.content.dao.ItemDAOFactory

Code Block

public class ItemDAOFactory
{
    public static ItemDAO getInstance(Context context)
    {   // Eventually, the implementation that is returned will be
        // defined in the configuration.
        return new ItemDAOPostgres(context);
    }
}
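
One way the "defined in the configuration" part could eventually work is sketched below. This is an assumption, not the current implementation: the property key dao.item.class, the describe() method and the ItemDAOOracle class are all hypothetical, and java.util.Properties stands in for the DSpace configuration.

```java
import java.util.Properties;

// Hypothetical, minimal DAO contract for the sketch.
interface ItemDAO
{
    String describe();
}

class ItemDAOPostgres implements ItemDAO
{
    public String describe() { return "postgres"; }
}

class ItemDAOOracle implements ItemDAO
{
    public String describe() { return "oracle"; }
}

class ConfigurableItemDAOFactory
{
    // Hypothetical configuration key naming the implementation class.
    static final String KEY = "dao.item.class";

    static ItemDAO getInstance(Properties config)
    {
        // Fall back to the PostgreSQL implementation when nothing is configured.
        String className = config.getProperty(KEY, ItemDAOPostgres.class.getName());
        try
        {
            Class<?> cls = Class.forName(className);
            return (ItemDAO) cls.getDeclaredConstructor().newInstance();
        }
        catch (ReflectiveOperationException | ClassCastException e)
        {
            throw new IllegalStateException("Cannot instantiate ItemDAO: " + className, e);
        }
    }
}
```

Switching databases then becomes a one-line configuration change rather than a code change.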

org.dspace.content.dao.postgres.ItemDAOPostgres

This is a fairly straightforward implementation of the above interface. As much as possible, code from the original Item class will be used. For instance, this is how getItems() is implemented:

Code Block

public List<Item> getItems()
{
    try
    {
        TableRowIterator tri = DatabaseManager.queryTable(context, "item",
            "SELECT item_id FROM item WHERE in_archive = '1'");

        List<Item> items = new ArrayList<Item>();
        for (TableRow row : tri.toList())
        {
            int id = row.getIntColumn("item_id");
            items.add(retrieve(id));
        }
        return items;
    }
    catch (SQLException sqle)
    {
        // Need to think more carefully about how we deal with SQLExceptions
        throw new RuntimeException(sqle);
    }
}

Some changes have been made to eliminate ItemIterator, and to generally make things a little more consistent with the rest of the code (this looks almost identical to, eg, CollectionDAO.getCollections()).

org.dspace.core

org.dspace.core.ArchiveManager

The idea behind this class came from the realisation that Item.withdraw() and Item.reinstate() don't really make sense. What I'd much rather do is call (eg) ArchiveManager.withdrawItem(Item item).

I've been thinking that the ArchiveManager could be used for certain maintenance operations as well, such as moving Items between Collections, and maybe acting as a wrapper for the CommunityFiliator.

Code Block

public class ArchiveManager
{
    public static void withdrawItem(Context context, Item item)
    {
        // ...
    }

    public static void reinstateItem(Context context, Item item)
    {
        // ...
    }

    public static void moveItem(Context context,
            Item item, Collection source, Collection dest)
    {
        // ...
    }
}
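
The method bodies are elided above. As a rough illustration of what a move operation has to guarantee, here is a sketch that uses in-memory stand-ins; the Item and Collection classes are hypothetical toys, the Context argument is dropped, and the real method would go through the DAOs rather than manipulate lists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins, not the real DSpace classes.
class Item
{
    final int id;
    Item(int id) { this.id = id; }
}

class Collection
{
    final String name;
    final List<Item> items = new ArrayList<>();
    Collection(String name) { this.name = name; }
}

class ArchiveManager
{
    static void moveItem(Item item, Collection source, Collection dest)
    {
        if (!source.items.contains(item))
        {
            throw new IllegalArgumentException(
                "Item " + item.id + " is not in collection " + source.name);
        }
        if (source == dest)
        {
            return; // nothing to do
        }
        // Add to the destination first so the item is never left without an owner.
        if (!dest.items.contains(item))
        {
            dest.items.add(item);
        }
        source.items.remove(item);
    }
}
```

Keeping this logic in one place means neither Item nor Collection has to know how the other is stored.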

org.dspace.storage

org.dspace.storage.dao.GlobalDAO

As suggested by Richard Jones, there probably ought to be a top-level general-purpose DAO interface that has implementations for the various storage mechanisms (GlobalDAOPostgres etc). The idea is to capture any implementation-specific details in this single top-level object, rather than in every Postgres DAO implementation. For example, with the current database "abstraction layer", the top-level implementation of GlobalDAO understands the Context object, whereas a Hibernate implementation would know what a SessionFactory is.

Code Block

public interface GlobalDAO
{
    // The following methods actually currently throw SQLExceptions to
    // keep things simple, but in future SQLExceptions should be
    // eliminated from any code that doesn't directly touch a database.
    public void startTransaction() throws GlobalDAOException;
    public void endTransaction() throws GlobalDAOException;
    public void saveTransaction() throws GlobalDAOException;
    public void abortTransaction();
    public boolean transactionOpen();
    @Deprecated Connection getConnection();
}
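
A caller might wrap a unit of work in these transaction methods roughly as follows. This is a sketch under assumptions: it guesses that endTransaction() commits and closes while abortTransaction() rolls back, the interface is trimmed to the methods used, GlobalDAOException is given a made-up constructor, and RecordingGlobalDAO and Transactions are hypothetical helpers that only log calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checked exception; the real class is not shown on this page.
class GlobalDAOException extends Exception
{
    GlobalDAOException(String message) { super(message); }
}

// Trimmed copy of the interface, limited to what the sketch needs.
interface GlobalDAO
{
    void startTransaction() throws GlobalDAOException;
    void endTransaction() throws GlobalDAOException;
    void abortTransaction();
    boolean transactionOpen();
}

// Fake implementation that records calls instead of touching a database.
class RecordingGlobalDAO implements GlobalDAO
{
    final List<String> log = new ArrayList<>();
    private boolean open = false;

    public void startTransaction() { open = true; log.add("start"); }
    public void endTransaction() { open = false; log.add("end"); }
    public void abortTransaction() { open = false; log.add("abort"); }
    public boolean transactionOpen() { return open; }
}

class Transactions
{
    // Runs the work inside a transaction; aborts if it did not end cleanly.
    static void run(GlobalDAO dao, Runnable work) throws GlobalDAOException
    {
        dao.startTransaction();
        try
        {
            work.run();
            dao.endTransaction();
        }
        finally
        {
            if (dao.transactionOpen())
            {
                dao.abortTransaction();
            }
        }
    }
}
```

Nothing in the calling code mentions a Connection or an SQLException, which is the separation the GlobalDAO is meant to provide.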

org.dspace.storage.dao.GlobalDAOFactory

Super-simple GlobalDAO factory.

org.dspace.storage.dao.GlobalDAOPostgres

Implementation of the GlobalDAO interface for PostgreSQL.

Code Block

public class GlobalDAOPostgres implements GlobalDAO
{
    private Connection connection;

    // ...

    public void startTransaction()
    {
        connection = DatabaseManager.getConnection();
        connection.setAutoCommit(false);
    }

    // ...
}