|
[dira05]
|
Karl Dias, Mark Ramacher, Uri Shaft, Venkateshwaran Venkataramani, and Graham
Wood.
Automatic performance diagnosis and tuning in Oracle.
In Second Biennial Conference on Innovative Data Systems
Research (CIDR'05), pages 84-94, 2005.
[ bib |
.pdf |
.pdf ]
|
|
[nath05]
|
Dushyanth Narayanan, Eno Thereska, and Anastassia Ailamaki.
Continuous resource monitoring for self-predicting DBMS.
In International Symposium on Modeling, Analysis, and Simulation
of Computer and Telecommunication Systems (MASCOTS'05), pages 239-248,
2005.
[ bib |
.pdf ]
|
|
[cosu04]
|
Glenn Colaco and Darrell Suggs.
Database performance with NAS: Optimizing Oracle on NFS.
Technical Report TR-3322, Network Appliance Corp., May 2004.
[ bib |
.pdf |
.pdf ]
|
|
[brke03]
|
R. Braumandl, A. Kemper, and D. Kossmann.
Quality of service in an information economy.
ACM Transactions on Internet Technology, 3(4):291-333, 2003.
[ bib |
.pdf ]
Discusses distributed query processing with QoS guarantees for each
query (not query class). Query evaluation plans are adapted on-the-fly.
|
|
[dies03]
|
Yixin Diao, Frank Eskesen, Steven Froehlich, Joseph L. Hellerstein, Lisa F.
Spainhower, and Maheswaran Surendra.
Generic online optimization of multiple configuration parameters with
application to a database server.
In Proc. 14th IFIP/IEEE International Workshop on Distributed
Systems: Operations and Management (DSOM), number 2867 in Lecture Notes in
Computer Science, pages 3-15. Springer-Verlag, 2003.
[ bib |
.pdf ]
A generic approach to feedback control of systems based on the
Nelder-Mead simplex method. Application is to buffer pool configuration.
|
|
[paro03]
|
Sujay Parekh, Kevin Rose, Joseph L. Hellerstein, Sam Lightstone, Matthew Huras,
and Victor Chang.
Managing the performance impact of administrative utilities.
In Self-Managing Distributed Systems - 14th IFIP/IEEE
International Workshop on Distributed Systems: Operations and Management
(DSOM 2003), number 2867 in Lecture Notes in Computer Science.
Springer-Verlag, 2003.
[ bib |
.pdf ]
|
|
[wabu02]
|
Wenguang Wang and Rick Bunt.
A self-tuning page cleaner for DB2.
In International Workshop on Modeling, Analysis, and Simulation
of Computer and Telecommunication Systems (MASCOTS), October 2002.
[ bib |
.pdf |
.pdf ]
|
|
[mapo02b]
|
P. Martin, W. Powley, H. Li, and K. Romanufa.
Managing database server performance to meet QoS requirements in
electronic commerce systems.
International Journal on Digital Libraries, 3(4):316-324,
2002.
[ bib |
.pdf |
.pdf ]
Describes the Quartermaster system, with application to database
buffer configuration.
|
|
[zama02]
|
Hamzeh Zawawy, Pat Martin, and Hossam Hassanein.
Capacity planning for database management systems using analytical
modeling.
In Proc. CASCON, 2002.
[ bib ]
|
|
[chwe00]
|
Surajit Chaudhuri and Gerhard Weikum.
Rethinking database system architecture: Towards a self-tuning
RISC-Style database system.
In Proc. International Conference on Very Large Data Bases,
pages 1-10, 2000.
[ bib |
.pdf ]
|
|
[chch99]
|
Surajit Chaudhuri, Eric Christensen, Goetz Graefe, Vivek R. Narasayya, and
Michael J. Zwilling.
Self-tuning technology in Microsoft SQL Server.
In Bulletin of the IEEE Technical Committee on Data
Engineering [loch99], pages 20-26.
[ bib ]
|
|
[scva99]
|
K. Bernhard Schiefer and Gary Valentin.
DB2 Universal Database performance tuning.
In Bulletin of the IEEE Technical Committee on Data
Engineering [loch99], pages 12-19.
[ bib ]
|
|
[loch99]
|
David Lomet and Surajit Chaudhuri, editors.
Bulletin of the IEEE Technical Committee on Data Engineering,
volume 22(2), June 1999.
[ bib |
.pdf ]
Special Issue on Self-Tuning Databases and Application Tuning
|
|
[vibr98]
|
Radek Vingralek, Yuri Breitbart, and Gerhard Weikum.
Snowball: Scalable storage on networks of workstations with balanced
load.
Distributed and Parallel Databases, 6(2):117-156, April 1998.
[ bib |
.ps.gz ]
|
|
[scwe98]
|
Peter Scheuermann, Gerhard Weikum, and Peter Zabback.
Data partitioning and load balancing in parallel disk systems.
The VLDB Journal, 7(1):48-66, 1998.
[ bib |
.pdf ]
|
|
[brca96]
|
Kurt P. Brown, Michael J. Carey, and Miron Livny.
Goal-oriented buffer management revisited.
In Proceedings of the 1996 ACM SIGMOD International Conference
on Management of Data, pages 353-364, June 1996.
[ bib |
.pdf ]
Describes the class fencing buffer management technique.
|
|
[gaio96]
|
Minos N. Garofalakis and Yannis E. Ioannidis.
Multi-dimensional resource scheduling for parallel queries.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD'96), pages 365-376, June 1996.
[ bib |
DOI |
.pdf ]
|
|
[dagr95]
|
Diane L. Davison and Goetz Graefe.
Dynamic resource brokering for multi-user query execution.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD'95), pages 281-292, 1995.
[ bib |
DOI |
.pdf ]
|
|
[logh94]
|
David Lomet and Shahram Ghandeharizadeh, editors.
Bulletin of the IEEE Technical Committee on Data Engineering,
volume 17(3), September 1994.
[ bib |
.pdf |
.pdf ]
Special Issue on Data Placement for Parallelism
|
|
[scwe94]
|
Peter Scheuermann, Gerhard Weikum, and Peter Zabback.
“Disk cooling” in parallel disk systems.
In IEEE Data Engineering Bulletin [logh94], pages
29-40.
[ bib ]
|
|
[brme94]
|
Kurt P. Brown, Manish Mehta, Michael J. Carey, and Miron Livny.
Towards automated performance tuning for complex workloads.
In Proc. International Conference on Very Large Data Bases,
pages 72-84, 1994.
[ bib |
.pdf ]
|
|
[scwe93]
|
Peter Scheuermann, Gerhard Weikum, and Peter Zabback.
Adaptive load balancing in disk arrays.
In Proceedings of the 4th International Conference on
Foundations of Data Organization and Algorithms (FODO'93), pages 345-360,
1993.
[ bib |
.ps ]
|
|
[hewa91]
|
Hans-Ulrich Heiss and Roger Wagner.
Adaptive load control in transaction processing systems.
In 17th International Conference on Very Large Data Bases,
pages 47-54, September 1991.
[ bib |
.PDF |
.pdf ]
Feedback control of concurrency. Assumes that the relationship between
level of concurrency and throughput is unimodal. Proposes two control
mechanisms. Incremental steps is a proportional control mechanism.
Parabolic approximation mechanism fits recent concurrency/throughput
measurements to a parabola and uses the peak of the parabola to choose
a target concurrency level. Considers the use of admission control
and transaction aborts to achieve the target level of concurrency.
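The parabolic approximation idea in this note can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the clamping bounds `lo` and `hi`, and the fallback used when the fitted curve has no peak are all assumptions made for illustration.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_parabola(points):
    """Least-squares fit of y = a*x^2 + b*x + c to (x, y) measurements."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sx2 = sum(x ** 2 for x, _ in points)
    sx3 = sum(x ** 3 for x, _ in points)
    sx4 = sum(x ** 4 for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx2y = sum(x * x * y for x, y in points)
    # Normal equations for the unknowns (a, b, c), solved with Cramer's rule.
    A = [[sx4, sx3, sx2], [sx3, sx2, sx], [sx2, sx, n]]
    rhs = [sx2y, sxy, sy]
    d = _det3(A)
    if d == 0:
        raise ValueError("need at least three distinct concurrency levels")
    coeffs = []
    for col in range(3):
        m = [[rhs[r] if c == col else A[r][c] for c in range(3)]
             for r in range(3)]
        coeffs.append(_det3(m) / d)
    return tuple(coeffs)


def target_concurrency(points, lo, hi):
    """Choose a target concurrency level from recent (level, throughput) pairs."""
    a, b, _ = fit_parabola(points)
    if a >= 0:
        # Assumed fallback: the fit has no peak, so reuse the best level seen.
        best = max(points, key=lambda p: p[1])[0]
    else:
        best = -b / (2 * a)  # vertex of the fitted parabola
    return max(lo, min(hi, best))
```

For example, measurements that lie on a parabola peaking at a concurrency level of 10 yield a target of 10, which admission control would then try to enforce.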
|
|
[mowe91]
|
Axel Mönkeberg and Gerhard Weikum.
Conflict-driven load control for the avoidance of data-contention
thrashing.
In Proceedings of the Seventh International Conference on Data
Engineering, pages 632-639, April 1991.
[ bib |
.pdf ]
|
|
[ngfa91]
|
Raymond Ng, Christos Faloutsos, and Timos Sellis.
Flexible buffer allocation based on marginal gains.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD'91), pages 387-396, 1991.
[ bib |
DOI |
.pdf ]
|
|
[cakr90]
|
Michael J. Carey, Sanjay Krishnamurthi, and Miron Livny.
Load control for locking: The 'half-and-half' approach.
In Proceedings of the Ninth ACM SIGACT-SIGMOD-SIGART Symposium
on Principles of Database Systems, pages 72-84, April 1990.
[ bib |
.pdf ]
|
|
[lafo88]
|
S. Lafortune.
Modeling and analysis of transaction execution in database systems.
IEEE Transactions on Automatic Control, 33(5):439-447, May
1988.
[ bib |
.pdf ]
Formulates the concurrency control problem as a control problem for
discrete-event dynamical systems.
|
|
[brch07]
|
Nicolas Bruno and Surajit Chaudhuri.
An online approach to physical design tuning.
In Proc. International Conference on Data Engineering
(ICDE'07), pages 826-835, April 2007.
[ bib |
.pdf ]
|
|
[agch06]
|
Sanjay Agrawal, Eric Chu, and Vivek Narasayya.
Automatic physical design tuning: workload as a sequence.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'06), pages 683-694, 2006.
[ bib |
.pdf ]
Considers a version of the database physical design problem in which
the input is a sequence of queries and updates. The goal is to recommend
a target physical design for each query or update in the sequence,
taking into account both the effect of the physical design on the
cost of executing the query or update and the cost of changing the
physical design.
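One way to picture this problem statement is as a shortest-path computation over (statement, design) pairs. The sketch below is only an illustration under assumptions: the candidate design set is small and given up front, and `exec_cost` and `change_cost` are hypothetical caller-supplied functions. It is not presented as the paper's algorithm.

```python
def plan_designs(statements, designs, exec_cost, change_cost, initial):
    """Return (total_cost, design_per_statement) minimising execution
    cost plus the cost of switching between physical designs."""
    if not statements:
        return (0, [])
    # best[d] = (cheapest total cost so far when the latest statement
    #            runs under design d, designs chosen along that path)
    best = {d: (change_cost(initial, d) + exec_cost(statements[0], d), [d])
            for d in designs}
    for stmt in statements[1:]:
        nxt = {}
        for d in designs:
            # Cheapest predecessor design, counting the transition into d.
            prev = min(designs, key=lambda p: best[p][0] + change_cost(p, d))
            cost = best[prev][0] + change_cost(prev, d) + exec_cost(stmt, d)
            nxt[d] = (cost, best[prev][1] + [d])
        best = nxt
    return min(best.values(), key=lambda t: t[0])
```

With three queries followed by two updates, an index that is expensive to build but free to drop is worth creating for the queries and dropping before the updates, which is the kind of recommendation the sequence formulation allows.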
|
|
[brch06]
|
Nicolas Bruno and Surajit Chaudhuri.
To tune or not to tune? A lightweight physical design alerter.
In Proc. International Conference on Very Large Data Bases
(VLDB'06), pages 499-510, 2006.
[ bib |
.pdf |
.pdf ]
|
|
[brch06b]
|
Nicolas Bruno and Surajit Chaudhuri.
Physical design refinement: The merge-reduce approach.
In Proc. International Conference on Extending Database
Technology (EDBT'06), number 3896 in Lecture Notes in Computer Science,
pages 386-404. Springer-Verlag, 2006.
[ bib |
.pdf ]
|
|
[brch05]
|
Nicolas Bruno and Surajit Chaudhuri.
Automatic physical database tuning: A relaxation-based approach.
In Proceedings of the 2005 ACM SIGMOD International Conference
on Management of Data (SIGMOD'05), 2005.
[ bib ]
Assumes that the optimizer requests indexes that it thinks might
be useful for a particular query. These requested indexes form the
initial configuration, which is then modified to meet a space constraint.
|
|
[coba05]
|
Mariano P. Consens, Denilson Barbosa, Adrian M. Teisanu, and Laurent Mignet.
Goals and benchmarks for autonomic configuration recommenders.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'05), 2005.
[ bib |
.pdf ]
Uses random queries that are generated from templates. Templates
constrain generated queries to ensure that they are reasonable and
that they can benefit from indexing.
|
|
[abha04]
|
Ashraf Aboulnaga, Peter J. Haas, Mokhtar Kandil, Sam Lightstone, Guy M. Lohman,
Volker Markl, Ivan Popivanov, and Vijayshankar Raman.
Automated statistics collection in DB2 UDB.
In International Conference on Very Large Data Bases (VLDB
'04), pages 1146-1157, August 2004.
[ bib |
.pdf |
.pdf ]
|
|
[agch04]
|
Sanjay Agrawal, Surajit Chaudhuri, Lubor Kollár, Arunprasad P. Marathe,
Vivek R. Narasayya, and Manoj Syamala.
Database tuning advisor for Microsoft SQL Server.
In International Conference on Very Large Data Bases (VLDB
'04), pages 1110-1121, 2004.
[ bib |
.pdf ]
|
|
[zizu04]
|
Daniel C. Zilio, Calisto Zuzarte, Sam Lightstone, Wenbin Ma, Guy M. Lohman,
Roberta Cochrane, Hamid Pirahesh, Latha S. Colby, Jarek Gryz, Eric Alton,
Dongming Liang, and Gary Valentin.
Recommending materialized views and indexes with IBM DB2 design
advisor.
In IEEE International Conference on Autonomic Computing
(ICAC'04), pages 180-188, 2004.
[ bib |
.pdf ]
General approach is to generate candidate MVs and indexes based on
the workload, and then filter to meet a space constraint. Multiquery
optimization is used when generating candidate MVs.
|
|
[agch03]
|
Sanjay Agrawal, Surajit Chaudhuri, Abhinandan Das, and Vivek Narasayya.
Automating layout of relational databases.
In International Conference on Data Engineering (ICDE'03),
pages 607-618, 2003.
[ bib |
.pdf ]
|
|
[razh02]
|
Jun Rao, Chun Zhang, Guy M. Lohman, and Nimrod Megiddo.
Automating physical database design in a parallel database.
In Proc. ACM SIGMOD International Conference on Management of
Data, pages 558-569, 2002.
[ bib |
.pdf ]
Automatic hash-based relation partitioning in shared-nothing systems.
|
|
[agch01]
|
Sanjay Agrawal, Surajit Chaudhuri, and Vivek R. Narasayya.
Materialized view and index selection tool for Microsoft SQL
Server 2000.
In Proc. ACM SIGMOD International Conference on Management of
Data, page 608, 2001.
[ bib ]
This is a one-page description of a SIGMOD demo.
|
|
[chna01]
|
Surajit Chaudhuri and Vivek Narasayya.
Automating statistics management for query optimizers.
IEEE Transactions on Knowledge and Data Engineering,
13(1):7-20, 2001.
[ bib |
.pdf ]
Hardcopy on file. This is the journal version of [chna00]. Presents a
variety of heuristic techniques for choosing minimal sets of statistics
in such a way that the quality of plans produced by the optimizer
is not reduced.
|
|
[agch00]
|
Sanjay Agrawal, Surajit Chaudhuri, and Vivek R. Narasayya.
Automated selection of materialized views and indexes in SQL
databases.
In Proc. International Conference on Very Large Data Bases,
pages 496-505, 2000.
[ bib |
.pdf ]
|
|
[chna00]
|
Surajit Chaudhuri and Vivek Narasayya.
Automating statistics management for query optimizers.
In 16th International Conference on Data Engineering, pages
339-348, 2000.
[ bib ]
The journal version of this paper is [chna01].
|
|
[leki00]
|
Mong Li Lee, Masaru Kitsuregawa, Beng Chin Ooi, Kian-Lee Tan, and Anirban
Mondal.
Towards self-tuning data placement in parallel database systems.
In Proceedings of the 2000 ACM SIGMOD International Conference
on Management of Data, pages 225-236, 2000.
[ bib |
.pdf ]
Adaptive declustering in shared-nothing systems using a two-level,
tree-structured index.
|
|
[vazu00]
|
Gary Valentin, Michael Zuliani, Daniel C. Zilio, Guy M. Lohman, and Alan
Skelley.
DB2 Advisor: An optimizer smart enough to recommend its own
indexes.
In 16th International Conference on Data Engineering, pages
101-110, 2000.
[ bib |
.pdf ]
General recommendation method is to add virtual indexes to the schema,
optimize the query, and check whether any virtual indexes are
used in the optimal plan. Statistics for virtual indexes are inferred
from existing column statistics. To recommend indexes for a workload,
recommend for each query in the workload in sequence and then greedily
select a subset of the recommended indexes.
|
|
[chna98]
|
Surajit Chaudhuri and Vivek R. Narasayya.
Autoadmin 'what-if' index analysis utility.
In Proc. ACM SIGMOD International Conference on Management of
Data, pages 367-378, 1998.
[ bib |
.pdf ]
How to implement hypothetical database configurations, so that workload
costs can be estimated under those configurations. Configuration
includes hypothetical indexes and statistics that allow the optimizer
to decide whether such an index should be used. Proposes that sampling
be used to collect the statistics. Allows specification of scale
factors so configurations with larger/smaller databases can be simulated.
Presents an analysis interface that supports workload analysis and
configuration analysis for current and hypothetical configurations.
|
|
[chna97]
|
Surajit Chaudhuri and Vivek R. Narasayya.
An efficient cost-driven index selection tool for Microsoft SQL
Server.
In Proc. International Conference on Very Large Data Bases,
pages 146-155, 1997.
[ bib |
.pdf ]
Assumes that an upper bound is given on the number of indexes. Workload
is specified as a set of SQL DML statements, including insert, delete
and update. Search space includes both single and multi-attribute
indexes. Index configurations are evaluated by the DBMS optimizer,
and several techniques are used to reduce the number of configurations
for which optimizer evaluation is required. To generate a set of
candidate indexes, this method determines an optimal index configuration
independently for each query in the workload. The initial candidate
set is then taken as the union of the indexes in the single-query
optimal configurations. A hybrid exhaustive/greedy approach is used
to control search. To find a k-index configuration, first find the
optimal m-index configuration (m <= k) using exhaustive search, then
add k-m indexes greedily. Multi-column indexes are handled by first
finding a good configuration with single-column indexes, then generating
and adding a set of candidate two-column indexes, and then rerunning
the optimizer on the new candidate set. This is repeated to handle
indexes with more than two columns.
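The hybrid exhaustive/greedy search described in this note can be sketched as follows. This is a minimal illustration, not the tool's implementation: `cost` stands in for an optimizer-estimated workload cost, and stopping early when no remaining index lowers the cost is an added assumption.

```python
from itertools import combinations


def hybrid_search(candidates, k, m, cost):
    """Pick up to k indexes: exhaustive search for the best m-index seed,
    then greedy additions while they reduce the estimated cost."""
    candidates = list(candidates)
    m = min(m, k, len(candidates))
    # Exhaustive phase: evaluate every m-index configuration.
    seed = min(combinations(candidates, m), key=lambda c: cost(frozenset(c)))
    config = set(seed)
    # Greedy phase: repeatedly add the single most beneficial index.
    while len(config) < k:
        remaining = [c for c in candidates if c not in config]
        if not remaining:
            break
        best = min(remaining, key=lambda c: cost(frozenset(config | {c})))
        if cost(frozenset(config | {best})) >= cost(frozenset(config)):
            break  # assumed stopping rule: no further improvement
        config.add(best)
    return frozenset(config)
```

Choosing a small m keeps the exhaustive phase cheap while still capturing interactions among the first few indexes that a purely greedy search could miss.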
|
|
[fisc88]
|
Sheldon J. Finkelstein, Mario Schkolnick, and Paolo Tiberio.
Physical database design for relational databases.
ACM Transactions on Database Systems, 13(1):91-128, 1988.
[ bib ]
|
|
[come78]
|
Douglas Comer.
The difficulty of optimum index selection.
ACM Transactions on Database Systems, 3(4):440-445, 1978.
[ bib ]
|
|
[soch08]
|
Gokul Soundararajan, Jin Chen, Mohamed Sharaf, and Cristiana Amza.
Dynamic partitioning of the cache hierarchy in shared data centers.
In Proc. Int'l Conference on Very Large Data Bases (VLDB'08),
August 2008.
[ bib |
.pdf ]
|
|
[yafa08]
|
Gala Yadgar, Michael Factor, Kai Li, and Assaf Schuster.
MC2: Multiple clients on a multilevel cache.
In Proc. Int'l Conference on Distributed Computing Systems
(ICDCS'08), June 2008.
[ bib |
.pdf |
.pdf ]
Extends Karma ([yafa07]) to a two-tier scenario in which multiple
clients that share data also share a second-tier cache. Space is
partitioned among clients, with one additional partition used for
shared data. Within each partition, space is managed using Karma.
|
|
[gama08]
|
Charles Garrod, Amit Manjhi, Anastasia Ailamaki, Bruce Maggs, Todd Mowry,
Christopher Olston, and Anthony Tomasic.
Scalable query result caching for web applications.
Proc. of the VLDB Endowment, 1(1):550-561, 2008.
[ bib |
DOI |
.pdf ]
|
|
[gill08]
|
Binny Gill.
On multi-level exclusive caching: Offline optimality and why
promotions are better than demotions.
In Proc. USENIX Conference on File and Storage Technologies
(FAST'08), pages 49-65, 2008.
[ bib |
.pdf |
.pdf ]
Includes lower and upper bounds on optimal off-line performance for
multi-level caches.
Proposes a scheme called PROMOTE for managing multi-tier caches. As
a requested page is passed up through the cache tiers, each cache
decides whether it will be responsible for caching the page. Once a
cache has decided to cache the page, it notifies the higher-level
caches of this by attaching a flag to the page as it is passed up
through the tiers. The higher-level caches then do not cache the page.
This enforces exclusiveness among the caches.
Pages that are repeatedly requested should tend to migrate to
higher-level caches.
Behaviour on writes is not specified, e.g., can a write affect which
cache is responsible for a particular page?
This policy requires modification of the caching policies at every
tier, as each tier must abide by caching decisions made at lower
tiers, and must inform upper tiers of its decisions.
|
|
[yafa07]
|
Gala Yadgar, Michael Factor, and Assaf Schuster.
Karma: Know-it-all replacement for a multilevel cache.
In Proc. USENIX Conference on File and Storage Technologies
(FAST'07), February 2007.
[ bib |
.pdf |
.pdf ]
Assumes caches support read, read-save, and demote. Requires that
blocks be grouped into ranges by the application. The application must
also specify the frequency of access and access pattern for each
range. Each range is assigned some space in some cache in the
hierarchy, and each range's space is then managed using a separate
replacement policy. Experiments used PostgreSQL explain to generate
range and access pattern hints; however, explain data covers the
situation before the DBMS cache.
|
|
[fasc06]
|
Michael Factor, Assaf Schuster, and Gala Yadgar.
Multilevel cache management based on application hints.
Technical Report CS-2006-02, Technion Computer Science Department,
2006.
[ bib |
.pdf |
.pdf ]
|
|
[chzh05]
|
Zhifeng Chen, Yan Zhang, Yuanyuan Zhou, Heidi Scott, and Berni Schiefer.
Empirical evaluation of multi-level buffer cache collaboration for
storage systems.
In Proceedings of the International Conference on Measurements
and Modeling of Computer Systems (SIGMETRICS'05), pages 145-156, 2005.
[ bib |
.pdf ]
Compares “hierarchically aware” approaches to “aggressively-collaborative”
approaches. The former are transparent to the storage client (e.g.,
the DBMS), the latter are not. Aggressively-collaborative approaches
include two types of hint-passing: access patterns and application
semantics. Example of semantic hint is a hint that a block will be
read only one time. Even more aggressive is content-aware caching,
where the caches try explicitly to avoid duplication. Also considers
some additional optimizations: Quick eviction of duplicated blocks
(DU) removes pages from the buffer when they are read, Semantics-Directed
Caching (SE) uses “importance” values from the storage client (in
an ill-specified way) to affect buffering at the storage server.
General conclusion is that the aggressively-collaborative approaches
do not help much compared to the hierarchically-aware approaches.
|
|
[fitz04]
|
Brad Fitzpatrick.
Distributed caching with memcached.
Linux Journal, 2004(124):5, August 2004.
[ bib |
http |
.html ]
|
|
[zhch04]
|
Yuanyuan Zhou, Zhifeng Chen, and Kai Li.
Second-level buffer cache management.
IEEE Transactions on Parallel and Distributed Systems, 15(7),
July 2004.
[ bib |
.ps ]
Presents a trace-based characterization of access patterns for second-tier
(L2) buffer caches, noting that L2 cache reference streams don't
exhibit any small reuse distances. Presents the MQ algorithm for
managing L2 cache. MQ uses multiple LRU queues. Pages are promoted
to higher queues according to frequency of reference. Replacements
happen in low queues first. An aging mechanism is used to demote
pages that cool down. Also describes so-called global replacement
algorithms, in which the L2 cache is informed when replacements are
made at L1. Evaluation is by trace-driven simulation and also by
experiment with a storage system cache implementation.
|
|
[bamo04]
|
S. Bansal and D. Modha.
CAR: Clock with adaptive replacement.
In Proc. of the 3rd USENIX Symposium on File and Storage
Technologies (FAST'04), March 2004.
[ bib |
.pdf |
.pdf ]
|
|
[bopl04]
|
R. Bonilla-Lucas, Peter Plachta, Aamer Sachedina, Daniel
Jiménez-González, Calisto Zuzarte, and Josep-Lluis Larriba-Pey.
Characterization of the data access behavior for TPC-C traces.
In Proc. IEEE International Symposium on Performance Analysis of
Systems and Software, pages 115-122, March 2004.
[ bib |
.pdf ]
|
|
[jizh04]
|
Song Jiang and Xiaodong Zhang.
ULC: A file block placement and replacement protocol to effectively
exploit hierarchical locality in multi-level buffer caches.
In Proc. 24th International Conference on Distributed Computing
Systems (ICDCS'04), pages 168-177, 2004.
[ bib |
.pdf ]
|
|
[chzh03]
|
Zhifeng Chen, Yuanyuan Zhou, and Kai Li.
Eviction-based cache placement for storage caches.
In Proceedings of the USENIX 2003 Annual Technical Conference,
pages 269-282, June 2003.
[ bib |
.pdf |
.pdf ]
Eviction-based placement means that a page is loaded into the storage
cache when it is evicted from the storage client's cache, as opposed
to when it is requested by the client. Proposes tracking evictions
transparently by monitoring the target addresses of read requests.
Evicted pages are prefetched from disk into the storage system cache
at the time of predicted eviction from the storage client cache.
|
|
[albo03]
|
Mehmet Altinel, Christof Bornhövd, Sailesh Krishnamurthy, C. Mohan, Hamid
Pirahesh, and Berthold Reinwald.
Cache tables: Paving the way for an adaptive database cache.
In Proc. International Conference on Very Large Data Bases,
pages 718-729, 2003.
[ bib |
.pdf |
.pdf ]
|
|
[medh03]
|
Nimrod Megiddo and Dharmendra S. Modha.
ARC: A self-tuning, low overhead replacement cache.
In Proc. USENIX Conference on File and Storage Technology
(FAST'03), 2003.
[ bib |
.pdf |
.pdf ]
ARC maintains two LRU queues, one for pages that have been referenced
once and one for pages that have been referenced more than once.
For each queue, there is also a ghost queue that tracks additional
pages. The sizes of the two queues are adjusted dynamically. A hit
in the single-reference ghost queue causes the single-reference queue
to grow. A hit in the multi-reference ghost queue causes the multi-referenced
queue to grow.
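The two-queue, two-ghost-queue arrangement in this note can be sketched as below. It is a simplified illustration rather than the published algorithm: the class and attribute names are invented here, and the target size is adjusted by a fixed step of one page per ghost hit, which is an assumption.

```python
from collections import OrderedDict


class ARCSketch:
    def __init__(self, capacity):
        self.c = capacity
        self.p = 0               # target size of the single-reference queue
        self.t1 = OrderedDict()  # cached pages referenced once (LRU first)
        self.t2 = OrderedDict()  # cached pages referenced more than once
        self.b1 = OrderedDict()  # ghost queue: page ids evicted from t1
        self.b2 = OrderedDict()  # ghost queue: page ids evicted from t2

    def _replace(self, in_b2):
        # Evict from t1 when it exceeds its target, otherwise from t2.
        if self.t1 and (len(self.t1) > self.p
                        or (in_b2 and len(self.t1) == self.p)
                        or not self.t2):
            page, _ = self.t1.popitem(last=False)
            self.b1[page] = None
        else:
            page, _ = self.t2.popitem(last=False)
            self.b2[page] = None

    def access(self, page):
        """Reference a page; return True on a cache hit."""
        if page in self.t1 or page in self.t2:
            self.t1.pop(page, None)
            self.t2.pop(page, None)
            self.t2[page] = None             # now referenced more than once
            return True
        if page in self.b1:                  # single-reference ghost hit
            self.p = min(self.c, self.p + 1)
            self._replace(False)
            del self.b1[page]
            self.t2[page] = None
            return False
        if page in self.b2:                  # multi-reference ghost hit
            self.p = max(0, self.p - 1)
            self._replace(True)
            del self.b2[page]
            self.t2[page] = None
            return False
        # Page not tracked anywhere: make room, then insert into t1.
        if len(self.t1) + len(self.b1) == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(False)
            else:
                self.t1.popitem(last=False)
        else:
            total = (len(self.t1) + len(self.t2)
                     + len(self.b1) + len(self.b2))
            if total >= self.c:
                if total == 2 * self.c:
                    self.b2.popitem(last=False)
                self._replace(False)
        self.t1[page] = None
        return False
```

A ghost hit counts as a miss, since only the page identifier was retained, but it shifts the target `p` so that later evictions favour the queue that would have kept the page.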
|
|
[wowi02]
|
Theodore M. Wong and John Wilkes.
My cache or yours? Making storage more exclusive.
In USENIX Annual Technical Conference (USENIX 2002), pages
161-175, June 2002.
[ bib |
.pdf |
.pdf ]
Notes that storage system caches are often LRU-based, and points
out the multi-tier cache inclusion problem: that a second-tier LRU
cache behind a first-tier LRU cache will contain many of the same
pages as the first-tier cache. Defines a “demote” operation to
deal with this problem. Demote sends to the second tier a block that
has been evicted from the first tier. Argues that the cost of sending
these blocks to the second tier is low because storage networks have
lots of bandwidth. Also defines a “demote” buffer management policy
for tier two. This puts blocks read by tier one at the LRU end of
the buffer (like our +read-read policy) and puts blocks demoted by
the first tier at the MRU end.
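The tier-two policy described in this note can be sketched with an ordered dictionary. This is a minimal illustration under assumptions: the class and method names are hypothetical, and block contents and the network transfer are omitted.

```python
from collections import OrderedDict


class DemoteTierTwo:
    """Second-tier cache: read blocks go to the LRU end, demoted blocks
    go to the MRU end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # first item is the LRU end

    def _make_room(self):
        # Evict from the LRU end until one slot is free.
        while len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)

    def read(self, block):
        """Tier one reads a block; return True if tier two had it cached."""
        hit = block in self.blocks
        self.blocks.pop(block, None)
        self._make_room()
        self.blocks[block] = None
        # Tier one now holds this block, so make it the next eviction victim.
        self.blocks.move_to_end(block, last=False)
        return hit

    def demote(self, block):
        """Tier one evicted a block and sent it down; keep it longest."""
        self.blocks.pop(block, None)
        self._make_room()
        self.blocks[block] = None    # appended at the MRU end
```

Placing read blocks at the LRU end reflects that tier one already caches them, while demoted blocks are exactly those tier one no longer holds, so retaining them in tier two reduces duplication.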
|
|
[aram02]
|
Ismail Ari, Ahmed Amer, Robert Gramacy, Ethan L. Miller, Scott Brandt, and
Darrell D. E. Long.
ACME: Adaptive caching using multiple experts.
In Workshop on Distributed Data and Structures 4 (WDAS), pages
143-158. Carleton Scientific, March 2002.
[ bib |
.pdf |
.pdf ]
|
|
[mapo02]
|
Patrick Martin, Wendy Powley, and Xiaoyi Xu.
Configuring buffer pools in DB2 UDB.
In Proc. CASCON, 2002.
[ bib ]
|
|
[mali00]
|
Patrick Martin, Hoi-Ying Li, Min Zheng, Keri Romanufa, and Wendy Powley.
Dynamic reconfiguration algorithm: Dynamically tuning multiple buffer
pools.
In 11th International Conference on Database and Expert Systems
Applications (DEXA), pages 92-101, 2000.
[ bib |
.pdf ]
|
|
[saha00]
|
Prasenjit Sarkar and John H. Hartman.
Hint-based cooperative caching.
ACM Transactions on Computer Systems, 18(4):387-419, 2000.
[ bib |
.pdf ]
Describes a cooperative two-level caching system in which first-level
caches may fetch blocks from other first-level caches. A hint is
potentially inaccurate information about the locations of blocks
in the first-level caches.
|
|
[phgo95]
|
Vidyadhar Phalke and Bhaskarpillai Gopinath.
An inter-reference gap model for temporal locality in program
behavior.
In Proceedings of the 1995 ACM SIGMETRICS Joint International
Conference on Measurement and Modeling of Computer Systems, pages 291-300,
1995.
[ bib |
.pdf ]
|
|
[josh94]
|
Theodore Johnson and Dennis Shasha.
2Q: A low overhead high performance buffer management replacement
algorithm.
In Proc. International Conference on Very Large Data Bases
(VLDB'94), pages 439-450, 1994.
[ bib |
.PDF ]
|
|
[onon93]
|
Elizabeth J. O'Neil, Patrick E. O'Neil, and Gerhard Weikum.
The LRU-K page replacement algorithm for database disk buffering.
In Proceedings of the ACM SIGMOD International Conference on
Management of Data (SIGMOD'93), pages 297-306, 1993.
[ bib |
.pdf ]
|
|
[muho92]
|
D. Muntz and P. Honeyman.
Multi-level caching in distributed file systems - or - your cache
ain't nuthin' but trash.
In Proceedings of the USENIX Winter Conference, pages 305-313,
January 1992.
[ bib |
.pdf ]
Simulation study of bi-level LRU caches for a distributed file system.
Found that increasing client cache size quickly reduces the hit rate
of the second tier cache.
|
|
[pazd91]
|
Mark Palmer and Stanley Zdonik.
Fido: A cache that learns to fetch.
In Proc. Int'l Conference on Very Large Data Bases, 1991.
[ bib |
.PDF ]
|
|
[duha82]
|
A. H. Duke, M. H. Hartung, J. D. Huntley, and F. J. Marschner.
Buffered writing in a peripheral storage hierarchy.
IBM Technical Disclosure Bulletin, 25(4):2075-2076, September
1982.
[ bib ]
Describes a scheme for synchronizing writes in batches, rather than
individually.
|
|
[bela66]
|
L. A. Belady.
A study of replacement algorithms for a virtual-storage computer.
IBM Systems Journal, 5(2):78-101, 1966.
[ bib |
.pdf ]
|
|
[ceca08]
|
Emmanuel Cecchet, George Candea, and Anastasia Ailamaki.
Middleware-based database replication: The gaps between theory and
practice.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD'08), pages 739-752, 2008.
[ bib |
http |
.pdf ]
|
|
[eldr07]
|
Sameh Elnikety, Steven Dropsho, and Willy Zwaenepoel.
Tashkent+: Memory-aware load balancing and update filtering in
replicated databases.
In Proc. EuroSys 2007, pages 399-412, March 2007.
[ bib |
.pdf ]
|
|
[eldr06]
|
Sameh Elnikety, Steven Dropsho, and Fernando Pedone.
Tashkent: Uniting durability with transaction ordering for
high-performance scalable database replication.
In Proc. EuroSys 2006, April 2006.
[ bib |
.pdf |
.pdf ]
|
|
[load06]
|
Jacob R. Lorch, Atul Adya, William J. Bolosky, Ronnie Chaiken, John R. Douceur,
and Jon Howell.
The SMART way to migrate replicated stateful services.
In Proc. EuroSys 2006, April 2006.
[ bib |
.pdf |
.pdf ]
|
|
[soam06]
|
Gokul Soundararajan, Cristiana Amza, and Ashvin Goel.
Database replication policies for dynamic content applications.
In Proc. EuroSys 2006, April 2006.
[ bib |
.pdf |
.pdf ]
|
|
[befe06]
|
Philip A. Bernstein, Alan Fekete, Hongfei Guo, Raghu Ramakrishnan, and Pradeep
Tamma.
Relaxed-currency serializability for middle-tier caching and
replication.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'06), pages 599-610, 2006.
[ bib |
.pdf ]
|
|
[maai06]
|
Amit Manjhi, Anastassia Ailamaki, Bruce M. Maggs, Todd C. Mowry, Christopher
Olston, and Anthony Tomasic.
Simultaneous scalability and security for data-intensive web
applications.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'06), pages 241-252, 2006.
[ bib |
.pdf ]
|
|
[elpe05]
|
Sameh Elnikety, Willy Zwaenepoel, and Fernando Pedone.
Database replication using generalized snapshot isolation.
In IEEE Symposium on Reliable Distributed Systems (SRDS'05),
pages 73-84, October 2005.
[ bib |
.pdf ]
|
|
[amco05]
|
Cristiana Amza, Alan L. Cox, and Willy Zwaenepoel.
A comparative evaluation of transparent scaling techniques for
dynamic content servers.
In Proc. International Conference on Data Engineering
(ICDE'05), pages 230-241, 2005.
[ bib |
.pdf ]
Studies the impact of several scaling issues: scheduling and concurrency
control, load balancing, and query result caching.
|
|
[gula05]
|
Hongfei Guo, Per-Åke Larson, and Raghu Ramakrishnan.
Caching with 'good enough' currency, consistency, and completeness.
In Proc. International Conference on Very Large Data Bases
(VLDB'05), pages 457-468, 2005.
[ bib |
.pdf |
.pdf ]
|
|
[like05]
|
Yi Lin, Bettina Kemme, Marta Patiño-Martínez, and Ricardo
Jiménez-Peris.
Middleware based data replication providing snapshot isolation.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'05), pages 419-430, 2005.
[ bib |
.pdf ]
|
|
[paji05]
|
Marta Patiño-Martínez, Ricardo Jiménez-Peris, Bettina Kemme, and
Gustavo Alonso.
MIDDLE-R: Consistent database replication at the middleware level.
ACM Transactions on Computer Systems, 23(4):375-423, 2005.
[ bib |
DOI |
.pdf ]
|
|
[olma05]
|
Christopher Olston, Amit Manjhi, Charles Garrod, Anastassia Ailamaki, Bruce M.
Maggs, and Todd C. Mowry.
A scalability service for dynamic web applications.
In Proc. Second Biennial Conference on Innovative Data Systems
Research (CIDR'05), pages 56-69, January 2005.
[ bib |
.pdf |
.pdf ]
|
|
[sash05]
|
Yasushi Saito and Marc Shapiro.
Optimistic replication.
ACM Computing Surveys, 37(1):42-81, 2005.
[ bib |
.pdf ]
|
|
[cema04]
|
Emmanuel Cecchet, Julie Marguerite, and Willy Zwaenepoel.
C-JDBC: Flexible database clustering middleware.
In USENIX 2004 Annual Technical Conference, FREENIX Track,
pages 9-18, 2004.
[ bib |
.pdf |
.pdf ]
|
|
[gula04]
|
Hongfei Guo, Per-Åke Larson, Raghu Ramakrishnan, and Jonathan Goldstein.
Relaxed currency and consistency: How to say Good Enough in SQL.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'04), pages 815-826, 2004.
[ bib |
.pdf ]
|
|
[lago04]
|
Per-Åke Larson, Jonathan Goldstein, and Jingren Zhou.
MTCache: Transparent mid-tier database caching in SQL Server.
In Proc. International Conference on Data Engineering
(ICDE'04), pages 177-189, 2004.
[ bib |
.pdf ]
|
|
[plal04]
|
Christian Plattner and Gustavo Alonso.
Ganymed: Scalable replication for transactional web applications.
In ACM/IFIP/USENIX International Middleware Conference
(Middleware 2004), number 3231 in Lecture Notes in Computer Science, pages
155-174, 2004.
[ bib |
.pdf ]
|
|
[amco03]
|
Cristiana Amza, Alan L. Cox, and Willy Zwaenepoel.
Distributed versioning: Consistent replication for scaling back-end
databases of dynamic content web sites.
In ACM/IFIP/USENIX International Middleware Conference
(Middleware 2003), number 2672 in Lecture Notes in Computer Science, pages
282-304, 2003.
[ bib |
.pdf ]
Assumes predeclaration of access sets in the application. Goal is
one-copy serializability in a cluster environment, where individual
queries in a transaction may be routed to different servers in the
cluster. Transactions are serialized by scheduler based on the predeclared
access sets. Read-one, write-all is used to handle replicas. Version
numbers are used to track which updates have been applied to each
table, and it is assumed that the back-end DBMS will apply version-tagged
updates in the desired order.
|
|
[amco03b]
|
Cristiana Amza, Alan L. Cox, and Willy Zwaenepoel.
Conflict-aware scheduling for dynamic content applications.
In USENIX Symposium on Internet Technologies and Systems,
2003.
[ bib |
.pdf |
.pdf ]
|
|
[pepa02]
|
Ricardo Jiménez-Peris, Marta Patiño-Martínez, Bettina Kemme, and
Gustavo Alonso.
Improving the scalability of fault-tolerant database clusters.
In Proc. International Conference on Distributed Computing
Systems (ICDCS'02), pages 477-484, 2002.
[ bib |
.pdf |
.pdf ]
Transactions are classified, and the database is partitioned into
conflict classes such that each transaction class uses a single conflict
class. The conflict classes need not be disjoint. Each conflict class
has a primary site. Replicas are updated lazily. Global serialization
order is determined by an atomic broadcast mechanism which is used
to deliver the transaction requests. This protocol needs to predict
conflicts between transactions.
|
|
[keal00]
|
Bettina Kemme and Gustavo Alonso.
Don't be lazy, be consistent: Postgres-R, a new way to implement
database replication.
In Proceedings of the International Conference on Very Large
Data Bases (VLDB'00), pages 134-143, 2000.
[ bib |
.pdf |
.pdf ]
Eager replication protocol that uses atomic broadcast to help serialize
transactions.
|
|
[brko99]
|
Yuri Breitbart, Raghavan Komondoor, Rajeev Rastogi, S. Seshadri, and Abraham
Silberschatz.
Update propagation protocols for replicated databases.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'99), pages 97-108, 1999.
[ bib |
.pdf ]
|
|
[anbr98]
|
Todd A. Anderson, Yuri Breitbart, Henry F. Korth, and Avishai Wool.
Replication, consistency, and practicality: Are these mutually
exclusive?
In Proc. ACM SIGMOD international conference on Management of
data (SIGMOD'98), pages 484-495, 1998.
[ bib |
.pdf ]
|
|
[brko97]
|
Yuri Breitbart and Henry F. Korth.
Replication and consistency: being lazy helps sometimes.
In Proc. ACM SIGACT-SIGMOD-SIGART Symposium on Principles of
Database Systems (PODS'97), pages 173-184, 1997.
[ bib |
.pdf ]
|
|
[grhe96]
|
Jim Gray, Pat Helland, Patrick E. O'Neil, and Dennis Shasha.
The dangers of replication and a solution.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'96), pages 173-182, 1996.
[ bib |
.pdf ]
|
|
[daga85]
|
Susan B. Davidson, Hector Garcia-Molina, and Dale Skeen.
Consistency in a partitioned network: a survey.
ACM Computing Surveys, 17(3):341-370, 1985.
[ bib |
.pdf ]
A classic.
|
|
[cosi10]
|
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell
Sears.
Benchmarking cloud serving systems with YCSB.
In Proc. ACM Symp. on Cloud Computing, June 2010.
[ bib |
.pdf ]
Benchmark defines two so-called tiers: performance and scale-up.
The former considers latency and throughput as offered load increases
with a fixed amount of resources. The latter looks at traditional
scale-up (does performance stay flat as more data, offered load and
resources are added) and elastic speedup (does performance improve
if more resources are added under constant load). Benchmark is designed
to be extensible, but core workload consists of randomized inserts,
updates, reads and sequential scans of keyed records. Benchmark is
implemented as a multi-threaded Java program with an interface layer
used to customize interactions with specific data managers. Not clear
whether this is a closed-loop or open-loop client.
|
|
[daag10]
|
Sudipto Das, Divyakant Agrawal, and Amr El Abbadi.
G-Store: A scalable data store for transactional multi key access
in the cloud.
In Proc. ACM Symp. on Cloud Computing, June 2010.
[ bib |
.pdf ]
Argues that many web applications need atomic multi-key access. Allows
definition of transient, arbitrary key-groups, across which atomic
operations are possible. Key groups are implemented by transferring
ownership of all keys in a group to a single leader node in the underlying
storage system, so that it can coordinate atomic operations without
the need for a distributed coordination protocol. Leader uses write-ahead
logging to support failure recovery at the leader node. However,
it seems that while the leader is down, the group is unavailable.
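
The ownership-transfer idea in this note can be illustrated with a small
in-process sketch. This is not G-Store's protocol or API: the Node and
KeyGroup classes and their methods are hypothetical names, and real
ownership transfer would involve messaging and failure handling that
are omitted here.

```python
class Node:
    """Hypothetical storage node holding the keys it currently owns."""
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> value for keys owned by this node
        self.log = []    # write-ahead log of group updates (leader only)


class KeyGroup:
    """Transient group: all member keys are moved to one leader node."""
    def __init__(self, leader, keys, owners):
        # owners maps each key to the Node that holds it before grouping.
        self.leader = leader
        self.keys = list(keys)
        self.home = {k: owners[k] for k in self.keys}
        for k, node in self.home.items():
            if node is not leader:
                # Transfer ownership of the key to the leader.
                leader.store[k] = node.store.pop(k)

    def atomic_update(self, updates):
        """Apply a multi-key update at the leader, logging it first."""
        if not set(updates) <= set(self.keys):
            raise KeyError("update touches keys outside the group")
        self.leader.log.append(dict(updates))  # log before applying
        self.leader.store.update(updates)

    def dissolve(self):
        """Return each key to the node that owned it before grouping."""
        for k, node in self.home.items():
            if node is not self.leader:
                node.store[k] = self.leader.store.pop(k)


a, b = Node("a"), Node("b")
a.store["x"] = 1
b.store["y"] = 2
group = KeyGroup(a, ["x", "y"], {"x": a, "y": b})
moved_away = "y" not in b.store
group.atomic_update({"x": 10, "y": 20})
group.dissolve()
```

Because every key in the group lives at the leader while the group
exists, the update needs no distributed commit; the cost is that the
group depends on that one node.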
|
|
[kili10]
|
Emre Kiciman, Benjamin Livshits, Madanlal Musuvathi, and Kevin C. Webb.
Fluxo: A system for internet service programming by non-expert
developers.
In Proc. ACM Symp. on Cloud Computing, June 2010.
[ bib |
.pdf ]
Restricted application programming model supporting common architectural
patterns for web services. Dataflow programming model with nodes
representing computation and edges representing data flow.
|
|
[alco10]
|
Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M.
Hellerstein, and Russell Sears.
BOOM analytics: Exploring data-centric, declarative programming for
the cloud.
In Proc. EuroSys Conference, April 2010.
[ bib |
.pdf |
.pdf ]
|
|
[voch10]
|
Hoang Tam Vo, Chun Chen, and Beng Chin Ooi.
Towards elastic transactional cloud storage with range query support.
In Proc. Int'l Conf. on Very Large Data Bases, 2010.
[ bib |
.pdf |
.pdf ]
|
|
[wuji10]
|
Sai Wu, Dawei Jiang, Beng Chin Ooi, and Kun-Lung Wu.
Efficient B+-tree based indexing for cloud data processing.
In Proc. Int'l Conf. on Very Large Data Bases, 2010.
[ bib |
.pdf |
.pdf ]
|
|
[tiiy09]
|
Omesh Tickoo, Ravi Iyer, Ramesh Illikkal, and Don Newell.
Modeling virtual machine performance: Challenges and approaches.
In Proc. Workshop on Hot Topics in Measurement and Modeling of
Computer Systems, June 2009.
[ bib |
.pdf |
.pdf ]
|
|
[kroe09]
|
Kirk L. Kroeker.
The evolution of virtualization.
Communications of the ACM, 52(3):18-20, March 2009.
[ bib ]
Tech-lite article talking about virtualization on hand-held devices,
about virtualization for software deployment, and about performance
and management.
|
|
[arfo09]
|
Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz,
Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica,
and Matei Zaharia.
Above the clouds: A Berkeley view of cloud computing.
Technical Report UCB/EECS-2009-28, University of California at
Berkeley, February 2009.
[ bib |
.pdf |
.pdf ]
|
|
[agsi09]
|
Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, and Raghu
Ramakrishnan.
Asynchronous view maintenance for VLSD databases.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD'09), pages 179-192, 2009.
[ bib |
DOI |
.pdf ]
|
|
[krhe09]
|
Tim Kraska, Martin Hentschel, Gustavo Alonso, and Donald Kossmann.
Consistency rationing in the cloud: Pay only when it matters.
Proc. of the VLDB Endowment, 2(1):253-264, 2009.
[ bib |
.pdf |
.pdf ]
Proposes that data be assigned to one of three consistency levels:
A, B, or C. Data assigned to level C have only session consistency
and eventual consistency of updates. Data assigned to level A have
serializable consistency. Data in the B category have adaptive consistency,
switching between session consistency and serializability at runtime.
|
|
[lawh09]
|
Horacio Andrés Lagar-Cavilla, Joseph Andrew Whitney, Adin Matthew Scannell,
Philip Patchin, Stephen M. Rumble, Eyal de Lara, Michael Brudno, and Mahadev
Satyanarayanan.
SnowFlock: Rapid virtual machine cloning for cloud computing.
In Proc. ACM European Conference on Computer Systems
(EuroSys'09), pages 1-12, 2009.
[ bib |
DOI |
.pdf ]
SnowFlock implements a fork (clone) operation for running VMs. There
is no implicit synchronization or communication between parent and
clone after the fork - anything required must be coded explicitly.
Cloned children live on a virtual network with the parent, and can
only communicate within this network. SnowFlock starts clones with
little initial state, and additional state is shipped on demand from
the parent, which uses copy-on-write to preserve a snapshot of its
state as of the time of cloning. Each clone gets a virtual disk which
is a snapshot of the parent's as of the time of cloning. This is
implemented using copy-on-write at the parent, which serves
pages to the clones (via blocktap) as necessary. This mechanism is
intended for the root device, not for I/O intensive data devices.
|
|
[hude08]
|
Wenjin Hu, Todd Deshane, and Jeanna Matthews.
Solaris virtualization options.
;login:, 33(5):7-17, October 2008.
[ bib ]
Mostly a how-to guide for system administrators, covering Containers,
Solaris xVM and Solaris xVM VirtualBox.
|
|
[chje08]
|
Ronnie Chaiken, Bob Jenkins, Paul Larson, Bill Ramsey, Darren Shakib, Simon
Weaver, and Jingren Zhou.
Scope: Easy and efficient parallel processing of massive data sets.
In Proc. Int'l Conference on Very Large Data Bases (VLDB'08),
2008.
[ bib |
.pdf ]
|
|
[cora08]
|
Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein,
Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, and Ramana
Yerneni.
PNUTS: Yahoo!'s hosted data serving platform.
Proc. of the VLDB Endowment, 1(2):1277-1288, 2008.
[ bib |
DOI |
.pdf ]
|
|
[cule08]
|
Brendan Cully, Geoffrey Lefebvre, Dutch T. Meyer, Mike Feeley, Norman C.
Hutchinson, and Andrew Warfield.
Remus: High availability via asynchronous virtual machine
replication.
In Proc. USENIX Symposium on Networked Systems Design and
Implementation (NSDI), page 161, 2008.
[ bib |
.pdf |
.pdf ]
|
|
[degh08]
|
Jeffrey Dean and Sanjay Ghemawat.
MapReduce: Simplified data processing on large clusters.
Communications of the ACM, 51(1):107-113, 2008.
[ bib |
DOI ]
|
|
[minh08]
|
Umar Farooq Minhas.
A performance evaluation of database systems on virtual machines.
Technical Report CS-2008-01, David R. Cheriton School of Computer
Science, University of Waterloo, January 2008.
Master's thesis.
[ bib |
.pdf |
.pdf ]
|
|
[olre08]
|
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew
Tomkins.
Pig Latin: A not-so-foreign language for data processing.
In Proc. ACM SIGMOD Int'l Conference on Management of Data,
pages 1099-1110, 2008.
[ bib |
.pdf ]
|
|
[sico08]
|
Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Erik Vee, Ramana
Yerneni, and Raghu Ramakrishnan.
Efficient bulk insertion into a distributed ordered table.
In Proc. ACM Int'l Conference on Management of Data
(SIGMOD'08), pages 765-778, 2008.
[ bib |
http |
.pdf ]
|
|
[shde07]
|
Piyush Shivam, Azbayar Demberel, Pradeep Gunda, David E. Irwin, Laura E. Grit,
Aydan R. Yumerefendi, Shivnath Babu, and Jeffrey S. Chase.
Automated and on-demand provisioning of virtual machines for database
applications.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'07), pages 1079-1081, June 2007.
[ bib |
DOI |
.pdf ]
Demo paper.
|
|
[sopo07]
|
Stephen Soltesz, Herbert Potzl, Marc Fiuczynski, Andy Bavier, and Larry
Peterson.
Container-based operating system virtualization: A scalable
high-performance alternative to hypervisors.
In Proc. EuroSys 2007, pages 275-288, March 2007.
[ bib |
.pdf ]
|
|
[deha07]
|
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, and
Avinash Lakshman.
Dynamo: Amazon's highly available key-value store.
In Proc. ACM Symposium on Operating Systems Principles
(SOSP'07), pages 205-220, 2007.
[ bib |
DOI |
.pdf ]
|
|
[isbu07]
|
Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly.
Dryad: distributed data-parallel programs from sequential building
blocks.
In Proc. EuroSys Conference, pages 59-72, 2007.
[ bib |
.pdf ]
|
|
[pazh07]
|
Pradeep Padala, Xiaoyun Zhu, Zhikui Wang, Sharad Singhal, and Kang G. Shin.
Performance evaluation of virtualization technologies for server
consolidation.
Technical Report HPL-2007-59, HP Laboratories Palo Alto, 2007.
[ bib |
.pdf |
.pdf ]
Compares Xen, OpenVZ, and base Linux configurations. Looks at two-tier
(Apache+PHP and MySQL) system under a RUBiS workload. Considers a
variety of configurations: both tiers on a single physical node,
each tier on a different node, and multiple application stacks with
the web tiers on one node and the database tiers on another node.
Found higher CPU overhead in the Xen configuration, relative to OpenVZ
and base Linux. Found that Xen DomU had much higher L2 cache miss
count than the base Linux system, but it is not clear how much of
this is from the kernel in DomU and how much is from the application.
|
|
[chde06]
|
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach,
Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber.
Bigtable: a distributed storage system for structured data.
In Proc. USENIX Symposium on Operating System Design and
Implementation (OSDI'06), 2006.
[ bib |
.pdf ]
|
|
[guch06]
|
D. Gupta, L. Cherkasova, R. Gardner, and A. Vahdat.
Enforcing performance isolation across virtual machines in Xen.
In Proc. of the ACM/IFIP/USENIX 7th International Middleware
Conference, 2006.
[ bib |
.pdf |
.pdf ]
|
|
[irch06]
|
David E. Irwin, Jeffrey S. Chase, Laura E. Grit, Aydan R. Yumerefendi, David
Becker, and Ken Yocum.
Sharing networked resources with brokered leases.
In Proc. USENIX Technical Conference, pages 199-212, 2006.
[ bib |
.pdf |
.pdf ]
Resource providers make resources available to brokers, which in
turn use them to satisfy requests from clients. Clients get lease
tickets from brokers, which understand which resources are available
from which providers, and which implement policies controlling which
clients get which resources. Clients can redeem tickets with resource
providers to obtain the lease, which gives the client access to
resources for a fixed time window. Shirako is a toolkit to facilitate
the construction of clients, brokers, and resource providers.
|
|
[khbe06]
|
G. Khanna, K. Beaty, G. Kar, and A. Kochut.
Application performance management in virtualized server
environments.
In Proc. IEEE/IFIP Network Operations and Management Symposium,
pages 373-381, 2006.
[ bib |
.pdf ]
|
|
[rair06]
|
Lavanya Ramakrishnan, David E. Irwin, Laura E. Grit, Aydan R. Yumerefendi,
Adriana Iamnitchi, and Jeffrey S. Chase.
Toward a doctrine of containment: Grid hosting with adaptive resource
control.
In Proc. ACM/IEEE Conference on High Performance Networking and
Computing (SC2006), 2006.
[ bib |
DOI |
.pdf ]
|
|
[clfr05]
|
Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul,
Christian Limpach, Ian Pratt, and Andrew Warfield.
Live migration of virtual machines.
In Proc. Symposium on Networked Systems Design and
Implementation (NSDI 2005), May 2005.
[ bib |
.pdf |
.pdf ]
|
|
[fotu05]
|
Ian Foster and Steven Tuecke.
Describing the elephant: The different faces of IT as service.
Queue, 3(6):26-29, 2005.
[ bib |
.pdf ]
|
|
[pido05]
|
Rob Pike, Sean Dorward, Robert Griesemer, and Sean Quinlan.
Interpreting the data: Parallel analysis with Sawzall.
Scientific Programming, 13(4):277-298, 2005.
[ bib |
.pdf ]
|
|
[rose05]
|
Mendel Rosenblum.
The reincarnation of virtual machines.
Queue, 2(5):34-40, 2005.
[ bib |
.pdf ]
|
|
[roga05]
|
Mendel Rosenblum and Tal Garfinkel.
Virtual machine monitors: Current technology and future trends.
IEEE Computer, 38(5):39-47, 2005.
[ bib |
.pdf ]
|
|
[smna05]
|
James E. Smith and Ravi Nair.
The architecture of virtual machines.
IEEE Computer, 38(5):32-38, 2005.
[ bib |
.pdf ]
|
|
[waha05]
|
Andrew Warfield, Steven Hand, Keir Fraser, and Tim Deegan.
Facilitating the development of soft devices.
In Proc. USENIX Annual Technical Conference, pages 379-382,
2005.
[ bib |
.pdf |
.pdf ]
|
|
[wimo04]
|
John Wilkes, Jeffrey Mogul, and Jaap Suermondt.
Utilification.
In Proceedings of the 11th ACM SIGOPS European Workshop,
September 2004.
[ bib |
.pdf |
.pdf ]
Discusses the process of preparing software applications and application
stacks for execution in a utility computing environment.
|
|
[dahe04]
|
Shaul Dar, Gil Hecht, and Eden Shochat.
dbswitch: Towards a database utility.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'04), pages 892-896, 2004.
[ bib |
.pdf ]
|
|
[degh04]
|
Jeffrey Dean and Sanjay Ghemawat.
MapReduce: Simplified data processing on large clusters.
In Proc. Symposium on Operating Systems Design and
Implementation (OSDI'04), pages 137-150, 2004.
[ bib |
.pdf ]
Proposes a programming model for highly parallelizable computations,
and describes a system that implements this model. The computation
input is a set of input key/value pairs, and the output is a set
of output key/value pairs. The computation itself is defined by two
functions. A Map function takes an input key value pair and produces
a set of intermediate key/value pairs. A Reduce function takes an
intermediate key and a set of values, and produces a single value.
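
As an illustration of the two-function model described in this note, the
sketch below runs a word count sequentially. It is only a single-process
stand-in for the distributed runtime; the names map_fn, reduce_fn, and
run_mapreduce are chosen for this example and are not taken from the
paper.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: (document name, text) -> intermediate (word, 1) pairs."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Reduce: (word, list of counts) -> a single total for that word."""
    return sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Sequential stand-in: map every input, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for inter_key, inter_value in map_fn(key, value):
            groups[inter_key].append(inter_value)
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}


docs = [("d1", "the quick fox"), ("d2", "the lazy dog")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

The grouping step in the middle is what the real system distributes:
intermediate pairs with the same key must reach the same Reduce call.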
|
|
[hupe04]
|
Lan Huang, Gang Peng, and Tzi-cker Chiueh.
Multi-dimensional storage virtualization.
In Proc. Joint International Conference on Measurement and
Modeling of Computer Systems, pages 14-24, 2004.
[ bib |
.pdf ]
|
|
[krga04]
|
Ivan Krsul, Arijit Ganguly, Jian Zhang, José A. B. Fortes, and Renato J. O.
Figueiredo.
VMPlants: Providing and managing virtual machine execution
environments for grid computing.
In Proc. ACM/IEEE Conference on High Performance Networking and
Computing (SC2004), 2004.
[ bib |
DOI |
.pdf ]
|
|
[chgo03]
|
A. Chandra, P. Goyal, and P. Shenoy.
Quantifying the benefits of resource multiplexing in on-demand data
centers.
In Proc. First Workshop on Algorithms and Architectures for
Self-Managing Systems, June 2003.
[ bib |
.pdf ]
|
|
[badr03]
|
Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho,
Rolf Neugebauer, Ian Pratt, and Andrew Warfield.
Xen and the art of virtualization.
In Proceedings of the Nineteenth ACM Symposium on Operating
Systems Principles (SOSP'03), pages 164-177. ACM Press, 2003.
[ bib |
.pdf ]
Very nice paper describing the hardware virtualization approach used
by Xen and changes it necessitates in the OS. Also includes some
empirical performance evaluation.
|
|
[maei03]
|
Susan Malaika, Andrew Eisenberg, and Jim Melton.
Standards for databases on the grid.
SIGMOD Record, 32(3), 2003.
[ bib |
.pdf ]
An overview of some data-related parts of the grid standardization
process, including OGSA, DAIS (Data Access and Integration) for standardizing
access to relational and XML data sources, OREP (OGSA Replication
Services), and DFDL (Data Format and Description Language).
|
|
[anar02]
|
Artur Andrzejak, Martin Arlitt, and Jerry Rolia.
Bounding the resource savings of utility computing models.
Technical Report HPL-2002-339, HP Laboratories, 2002.
[ bib |
.pdf |
.pdf ]
|
|
[foke02]
|
I. Foster, C. Kesselman, J. Nick, and S. Tuecke.
Grid services for distributed system integration.
Computer, 35(6), 2002.
[ bib |
.pdf |
.pdf ]
Extended version can be found at http://www.globus.org/research/papers/ogsa.pdf.
This is an overview of the Open Grid Services Architecture (OGSA),
which defines something very much like a distributed object system.
|
|
[rozh02]
|
Jerry Rolia, Xiaoyun Zhu, Martin Arlitt, and Artur Andrzejak.
Statistical service assurances for applications in utility grid
environments.
In IEEE International Symposium on Modeling, Analysis, and
Simulation of Computer and Telecommunications Systems (MASCOTS'02), pages
247-256, 2002.
[ bib |
.pdf ]
|
|
[sach02]
|
Constantine P. Sapuntzakis, Ramesh Chandra, Ben Pfaff, Jim Chow, Monica S. Lam,
and Mendel Rosenblum.
Optimizing the migration of virtual computers.
In Proc. Symposium on Operating System Design and Implementation
(OSDI'02), 2002.
[ bib |
.pdf ]
|
|
[chfo01]
|
A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke.
The data grid: Towards an architecture for the distributed management
and analysis of large scientific datasets.
Journal of Network and Computer Applications, 23:187-200,
2001.
[ bib |
.pdf |
.pdf ]
Defines the core services of a data grid as a file-oriented storage
service plus a distributed directory for meta-data. Also has some
discussion of higher level services, like replication.
|
|
[gupu11]
|
Jorge Guerra, Himabindu Pucha, Joseph S. Glider, Wendy Belluomini, and Raju
Rangaswami.
Cost effective storage using extent based dynamic tiering.
In USENIX Conference on File and Storage Technologies, pages
273-286, February 2011.
[ bib |
.pdf |
.pdf ]
Includes a configuration advisor and a run-time tiering mechanism.
The advisor uses a storage workload trace to estimate the capacity
required in each tier during each time period, assuming that the
run-time mechanism is moving each extent to the lowest cost tier
that can satisfy the extent's I/O requirements during each epoch.
The advisor recommends provisioning according to the maximum demand
at each tier over all epochs. Epochs are assumed to be minutes/hours
in duration. At run-time, a dynamic tier manager adjusts the placement
of extents after each epoch. It chooses a tier for each extent that
will minimize power consumption, among tiers that can satisfy the
extent's performance demands. Within a tier, it also
assigns extents to specific devices, attempting to consolidate so
that devices can be powered down. Necessary migrations are then scheduled
to run gradually.
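
The advisor's sizing rule, as summarized in this note, can be sketched as
follows. Everything concrete here is an assumption made for
illustration: the three tiers, their relative costs and IOPS densities,
the fixed 1 GB extent size, and the function names. The paper's actual
cost and performance models are not reproduced.

```python
from collections import defaultdict

# Hypothetical tiers: (name, relative cost per GB, IOPS sustainable per GB).
TIERS = [("sata", 1.0, 0.5), ("sas", 3.0, 2.0), ("ssd", 10.0, 100.0)]


def cheapest_tier(iops, size_gb):
    """Lowest-cost tier whose IOPS density covers the extent's demand."""
    density = iops / size_gb
    feasible = [t for t in TIERS if t[2] >= density]
    if not feasible:
        raise ValueError("no tier can satisfy this extent")
    return min(feasible, key=lambda t: t[1])[0]


def recommend_capacity(trace, extent_size_gb=1.0):
    """trace: list of epochs, each mapping extent id -> IOPS in that epoch.

    Returns the capacity (GB) to provision per tier: the maximum demand
    seen at that tier over all epochs.
    """
    peak = defaultdict(float)
    for epoch in trace:
        demand = defaultdict(float)
        for _extent, iops in epoch.items():
            demand[cheapest_tier(iops, extent_size_gb)] += extent_size_gb
        for tier, gb in demand.items():
            peak[tier] = max(peak[tier], gb)
    return dict(peak)


trace = [
    {"e1": 0.2, "e2": 1.5, "e3": 50.0},  # epoch 1: e3 is hot
    {"e1": 0.1, "e2": 0.3, "e3": 1.0},   # epoch 2: e2 and e3 cool down
]
plan = recommend_capacity(trace)
```

In this toy trace the hot extent needs the fastest tier in only one
epoch, yet that tier must still be provisioned for it, which is the
effect of sizing by per-tier maximum over epochs.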
|
|
[babo09]
|
Shivnath Babu, Nedyalko Borisov, Sandeep Uttamchandani, Ramani Routray, and
Aameek Singh.
DIADS: Addressing the "my-problem-or-yours" syndrome with
integrated SAN and database diagnosis.
In Proc. USENIX Conference on File and Storage Technologies
(FAST'09), 2009.
[ bib |
.pdf |
.pdf ]
Describes a system that uses database query execution plans, storage
layout and storage system configuration to attempt to pinpoint the
cause of performance problems (e.g., slow queries) in a DBMS plus
storage system stack.
|
|
[guah09]
|
Ajay Gulati, Irfan Ahmad, and Carl A. Waldspurger.
PARDA: Proportional allocation of resources for distributed storage
access.
In Proc. USENIX Conference on File and Storage Technologies
(FAST'09), 2009.
[ bib |
.pdf |
.pdf ]
Mechanism for proportional allocation of storage server bandwidth
among multiple storage clients. Each client observes server response
times to detect overload conditions, and then throttles its request
stream by an amount determined by the share it expects to receive.
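
A latency-driven throttle of the kind described in this note might look
like the sketch below. The update rule is a generic congestion-control
style window adjustment written for this example; the parameter names
(gamma, latency_threshold, share) and the window bounds are assumptions,
and the rule should not be read as the paper's exact control law.

```python
def update_window(window, observed_latency, latency_threshold, share,
                  gamma=0.5, w_min=1.0, w_max=64.0):
    """Return the client's next limit on outstanding requests.

    The window shrinks when observed latency exceeds the threshold
    (taken as a sign of server overload) and grows otherwise. The
    additive share term gives clients with larger shares larger windows.
    """
    target = (latency_threshold / observed_latency) * window + share
    new_window = (1.0 - gamma) * window + gamma * target
    return max(w_min, min(w_max, new_window))


# Overloaded server: latency is twice the threshold, so the window shrinks.
shrunk = update_window(32.0, 50.0, 25.0, 2.0)
# Lightly loaded server: latency is half the threshold, so the window grows.
grown = update_window(8.0, 12.5, 25.0, 2.0)
```

Each client runs this update independently using only its own latency
observations, so no coordination among clients is needed in this sketch.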
|
|
[solu09]
|
Gokul Soundararajan, Daniel Lupei, Saeed Ghanbari, Adrian Daniel Popescu, Jin
Chen, and Cristiana Amza.
Dynamic resource allocation for database servers running on virtual
storage.
In Proc. USENIX Conference on File and Storage Technologies
(FAST'09), 2009.
[ bib |
.pdf ]
Considers how to apportion database and storage server buffer cache
space and storage system bandwidth across multiple workloads.
|
|
[guah08]
|
Ajay Gulati and Irfan Ahmad.
Towards distributed storage resource management using flow control.
In Int'l Workshop on Storage and I/O Virtualization,
Performance, Energy, Evaluation and Dependability (SPEED'08), February 2008.
[ bib |
.pdf |
.pdf ]
|
|
[somi08]
|
Gokul Soundararajan, Madalin Mihailescu, and Cristiana Amza.
Context-aware prefetching at the storage server.
In Proc. USENIX Annual Technical Conference, pages 377-390,
2008.
[ bib |
.pdf |
.pdf ]
|
|
[qiiy06]
|
Lin Qiao, Balakrishna R. Iyer, Divyakant Agrawal, and Amr El Abbadi.
Automated storage management with QoS guarantee in large-scale
virtualized storage systems.
Bulletin of the IEEE Technical Committee on Data Engineering,
29(3):47-54, September 2006.
[ bib |
.ps |
.pdf ]
|
|
[qiag06]
|
Lin Qiao, Divyakant Agrawal, Amr El Abbadi, and Balakrishna R. Iyer.
Pulsatingstore: An analytic framework for automated storage
management.
In Proc. International Conference on Data Engineering Workshops,
Workshop on Self-Managing Database Systems (SMDB'06), page 1213, 2006.
[ bib |
.pdf ]
|
|
[qiiy06b]
|
Lin Qiao, Balakrishna R. Iyer, Divyakant Agrawal, and Amr El Abbadi.
Automated storage management with QoS guarantees.
In Proc. International Conference on Data Engineering
(ICDE'06), page 150, 2006.
[ bib |
.pdf ]
|
|
[mang05]
|
Radhakrishnan Manga.
Database layout with Data ONTAP.
Technical Report TR-3411, Network Appliance Corp., September 2005.
[ bib |
.pdf |
.pdf ]
|
|
[ansp05]
|
Eric Anderson, Susan Spence, Ram Swaminathan, Mahesh Kallahalla, and Qian Wang.
Quickly finding near-optimal storage designs.
ACM Transactions on Computer Systems, 23(4):337-374, 2005.
[ bib |
DOI |
.pdf ]
|
|
[liiy05]
|
Lin Qiao, Balakrishna R Iyer, Divyakant Agrawal, and Amr El Abbadi.
SVL: Storage virtualization engine leveraging DBMS technology.
In Proceedings of the 21st International Conference on Data
Engineering (ICDE'05), pages 1048-1059, 2005.
[ bib |
.pdf ]
|
|
[qiiy05]
|
Lin Qiao, Balakrishna R. Iyer, Divyakant Agrawal, Amr El Abbadi, and Sandeep
Uttamchandani.
PULSTORE: Automated storage management with QoS guarantee.
In Proc. International Conference on Autonomic Computing
(ICAC'05), pages 302-303, 2005.
[ bib |
.pdf ]
|
|
[qiiy05b]
|
Lin Qiao, Balakrishna R Iyer, Divyakant Agrawal, Amr El Abbadi, and Sandeep
Uttamchandani.
PULSTORE: Automated storage management with QoS guarantee in
large-scale virtualized storage systems.
This is a longer unpublished version of the ICAC'05 publication
[qiiy05], 2005.
[ bib |
.pdf ]
|
|
[utyi05]
|
Sandeep Uttamchandani, Li Yin, Guillermo A. Alvarez, John Palmer, and Gul Agha.
CHAMELEON: A self-evolving, fully-adaptive resource arbitrator for
storage systems.
In Proc. USENIX 2005 Annual Technical Conference, pages 75-88,
2005.
[ bib |
.pdf |
.pdf ]
|
|
[wech04]
|
Wei Jin, Jeffrey S. Chase, and Jasleen Kaur.
Interposed proportional sharing for a storage service utility.
In Proc. International Conference on Measurements and Modeling
of Computer Systems (SIGMETRICS'04), pages 37-48, June 2004.
[ bib |
.pdf ]
|
|
[dech03]
|
Murthy Devarakonda, David Chess, Ian Whalley, Alla Segal, Pawan Goyal, Aamer
Sachedina, Keri Romanufa, Ed Lassettre, William Tetzlaff, and Bill Arnold.
Policy-based autonomic storage allocation.
In Proc. 14th IFIP/IEEE International Workshop on Distributed
Systems: Operations and Management (DSOM), number 2867 in Lecture Notes in
Computer Science, pages 143-154. Springer-Verlag, 2003.
[ bib |
.pdf ]
|
|
[goja03]
|
Pawan Goyal, Divyesh Jadav, Dharmendra S. Modha, and Renu Tewari.
CacheCOW: QoS for storage system caches.
In Eleventh International Workshop on Quality of Service (IWQoS
03), 2003.
[ bib |
CiteSeer |
.pdf ]
Allocation of buffer space in the face of a multi-class workload
with QoS (mean response time) requirements for each class. Dynamic
algorithms.
|
|
[lume03]
|
Christopher Lumb, Arif Merchant, and Guillermo Alvarez.
Facade: virtual storage devices with performance guarantees.
In Proceedings of the 2nd USENIX Conference on File and
Storage Technologies, pages 131-144, 2003.
[ bib |
CiteSeer |
.pdf ]
Enforcement of SLOs for disk. SLOs are load/response time curves
for reads and writes. Enforcement mechanism throttles requests from
hosts to storage devices. Assumes offered loads are feasible for
the underlying storage devices. Uses real-time scheduling (EDF) to
put requests into device queues so that deadline targets are met.
Device queue lengths are managed with feedback control.
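
The EDF dispatch step mentioned in this note can be sketched with a
priority queue. This is a minimal illustration, not Facade's
implementation: the class and method names are hypothetical, and the
bound on outstanding requests is a fixed constructor argument here,
whereas the note says the real system adjusts queue lengths with
feedback control.

```python
import heapq


class EdfScheduler:
    """Releases requests to a device queue in earliest-deadline order."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # device queue length bound
        self.pending = []      # heap of (deadline, sequence number, request)
        self.outstanding = 0   # requests currently at the device
        self._seq = 0          # tie-breaker for equal deadlines

    def submit(self, request, arrival, target_latency):
        """Queue a request with deadline = arrival + target latency."""
        deadline = arrival + target_latency
        heapq.heappush(self.pending, (deadline, self._seq, request))
        self._seq += 1

    def dispatch(self):
        """Issue pending requests, earliest deadline first, up to the bound."""
        issued = []
        while self.pending and self.outstanding < self.max_outstanding:
            _deadline, _seq, request = heapq.heappop(self.pending)
            self.outstanding += 1
            issued.append(request)
        return issued

    def complete(self):
        """Record that the device finished one request."""
        self.outstanding -= 1


sched = EdfScheduler(max_outstanding=2)
sched.submit("a", arrival=0.0, target_latency=10.0)  # deadline 10.0
sched.submit("b", arrival=1.0, target_latency=2.0)   # deadline 3.0
sched.submit("c", arrival=2.0, target_latency=5.0)   # deadline 7.0
first = sched.dispatch()
sched.complete()
second = sched.dispatch()
```

Keeping the device queue short leaves more requests in the pending heap,
where a late arrival with a tight deadline can still be ordered ahead of
earlier ones.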
|
|
[anho02]
|
Eric Anderson, Michael Hobbs, Kimberly Keeton, Susan Spence, Mustafa Uysal, and
Alistair Veitch.
Hippodrome: running circles around storage administration.
In Conference on File and Storage Technology (FAST'02), pages
175-188, January 2002.
[ bib |
.pdf |
.pdf ]
Given workload, configure block-level storage system. Workload is
described as stores (logically contiguous set of blocks) and streams
(details in [veke01]). Workload analysis tool can produce stream-based
workload description from a request trace. A configuration consists
of number of disks, grouping of disks into arrays, division of arrays
into logical units, disk controller and cache settings, and a mapping
of the stores in the workload onto the logical units. Includes a
migration component to move the storage system between configurations.
Details of configuration finder are in [anka01]. Iterative approach
does not assume that workload remains constant as the system configuration
changes.
|
|
[shvi02]
|
Prashant J. Shenoy and Harrick M. Vin.
Cello: A disk scheduling framework for next generation operating
systems.
Real-Time Systems, 22(1-2):9-48, 2002.
[ bib ]
|
|
[waos02]
|
Julie Ward, Michael O'Sullivan, Troy Shahoumian, and John Wilkes.
Appia: automatic storage area network design.
In Conference on File and Storage Technology (FAST'02), pages
203-217, January 2002.
[ bib |
.pdf |
.pdf ]
|
|
[anka01]
|
E. Anderson, M. Kallahalla, S. Spence, R. Swaminathan, and Q. Wang.
Ergastulum: quickly finding near-optimal storage system designs.
Technical Report HPL-SSP-2001-5, HP Laboratories, July 2001.
[ bib |
.pdf |
.pdf ]
Input includes a workload description in terms of logical stores
and streams of accesses to the stores, a description of the available
physical devices, a set of constraints on how those devices may be
configured, maximum utilizations, or total system cost, and finally
a cost function that can be used to compare alternative configurations.
The output includes a grouping of devices into RAID logical units
(LUs) and settings for configuration parameters, such as stripe sizes.
Search through the design space is heuristic. Cost functions are
externally defined, so the system cannot exploit their structure
to improve the search.
|
|
[wilk01]
|
John Wilkes.
Traveling to Rome: QoS specifications for automated storage
system management.
In Proc. Intl. Workshop on Quality of Service (IWQoS'2001),
number 2092 in Lecture Notes in Computer Science, pages 75-91.
Springer-Verlag, June 2001.
[ bib |
CiteSeer |
.pdf ]
Provides an historical overview of an HP effort in automated storage
system management. Provides examples of specification language used
to describe workloads and system configurations.
|
|
[albo01]
|
Guillermo A. Alvarez, Elizabeth Borowsky, Susie Go, Theodore H. Romer, Ralph
Becker-Szendy, Richard Golding, Arif Merchant, Mirjana Spasojevic, Alistair
Veitch, and John Wilkes.
Minerva: An automated resource provisioning tool for large-scale
storage systems.
ACM Transactions on Computer Systems, 19(4):483-518, 2001.
[ bib |
.ps.Z |
.ps.Z ]
|
|
[brbr99]
|
John L. Bruno, Jose Carlos Brustoloni, Eran Gabber, Banu Ozden, and Abraham
Silberschatz.
Disk scheduling with quality of service guarantees.
In IEEE International Conference on Multimedia Computing and
Systems (ICMCS 1999), Vol. 2, pages 400-405, 1999.
[ bib |
CiteSeer |
.pdf ]
|
|
[pore11]
|
Raluca Ada Popa, Catherine Redfield, Nickolai Zeldovich, and Hari Balakrishnan.
CryptDB: Protecting confidentiality with encrypted query
processing.
In Symposium on Operating Systems Principles, October 2011.
[ bib |
.pdf |
.pdf ]
|
|
[kovi11]
|
Ioannis Koltsidas and Stratis D. Viglas.
Data management over flash memory (tutorial presentation).
In Proc. ACM SIGMOD Int'l Conf. on Management of Data, June
2011.
[ bib |
.pdf ]
|
|
[mrys11]
|
Michael Rys.
Scalable SQL.
Communications of the ACM, 54(6):48-53, June 2011.
[ bib ]
Discusses data and functional partitioning in fairly generic terms.
Also includes a case study of scaleout for MySpace, using SQL Server.
|
|
[shmi11]
|
Mohammad Bilal Sheikh, Umar Farooq Minhas, Omar Zia Khan, Ashraf Aboulnaga,
Pascal Poupart, and David J. Taylor.
A Bayesian approach to online performance modeling for database
appliances using Gaussian models.
In Proc. Int'l Conf. on Autonomic Computing, June 2011.
[ bib |
.pdf ]
|
|
[bihu11]
|
Kenneth P. Birman, Qi Huang, and Dan Freedman.
Overcoming the D in CAP: Using Isis2 to build locally
responsive cloud services.
Technical report, Cornell University, April 2011.
Unnumbered technical report.
[ bib |
.pdf |
.pdf ]
|
|
[lajo11]
|
Horacio Lagar, Kaustubh Joshi, Matti Hiltunen, Roy Bryant, Eyal de Lara, Alexey
Tumanov, Olga Irzak, and Adin Scannell.
Kaleidoscope: Cloud micro-elasticity via VM state coloring.
In Proc. EuroSys Conference, April 2011.
[ bib |
.pdf |
.pdf ]
|
|
[babo11]
|
Jason Baker, Chris Bond, James Corbett, J.J. Furman, Andrey Khorlin, James
Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, and Vadim Yushprakh.
Megastore: Providing scalable, highly available storage for
interactive services.
In Conference on Innovative Database Research, January 2011.
[ bib |
.pdf |
.pdf ]
|
|
[becs11]
|
Philip A. Bernstein, Istvan Cseri, Nishant Dani, Nigel Ellis, Ajay Kallan,
Gopal Kakivaya, David B. Lomet, Ramesh Manne, Lev Novik, and Tomas Talius.
Adapting Microsoft SQL Server for cloud computing.
In Proc. IEEE Int'l Conf. on Data Engineering, 2011.
[ bib |
http ]
Uses partitioning, with transactions confined to a single partition.
Replication for HA. Storage engine for Azure.
|
|
[cujo11]
|
Carlo Curino, Evan Jones, Raluca Ada Popa, Nirmesh Malviya, Eugene Wu, Samuel
Madden, Hari Balakrishnan, and Nickolai Zeldovich.
Relational Cloud: A database service for the cloud.
In Conference on Innovative Database Research, January 2011.
[ bib |
.pdf |
.pdf ]
Multiple multi-tenant DBMS, each hosting one or more workloads. Large
workloads can be scaled out over multiple DBMS using workload-aware
partitioning.
|
|
[hewi11]
|
Eben Hewitt.
Cassandra: The Definitive Guide.
O'Reilly, 2011.
[ bib |
.pdf ]
|
|
[nawi11]
|
Mahdi Tayarani Najaran, Primal Wijesekera, Andrew Warfield, and Norman C.
Hutchinson.
Distributed indexing and locking: In search of scalable consistency.
In Proc. Workshop on Large Scale Distributed Systems and
Middleware, 2011.
[ bib |
.pdf ]
|
|
[wepi11]
|
Zhou Wei, Guillaume Pierre, and Chi-Hung Chi.
CloudTPS: Scalable transactions for Web applications in the
cloud.
IEEE Transactions on Services Computing, 2011.
[ bib |
.pdf |
.pdf ]
|
|
[catt10]
|
Rick Cattell.
Scalable SQL and NoSQL data stores.
SIGMOD Record, 39(4):12-27, December 2010.
[ bib |
.pdf |
.pdf ]
|
|
[adbo10]
|
Sarita V. Adve and Hans-J. Boehm.
Memory models: A case for rethinking parallel languages and hardware.
Communications of the ACM, 53(8):90-101, August 2010.
[ bib ]
Excellent overview of hardware and high-level language memory models.
|
|
[stuh10]
|
Julian Stuhler.
IBM DB2 pureScale: The next big thing or a solution looking for a
problem?
Database Journal, July 2010.
[ bib |
http ]
|
|
[brho10]
|
Erik Brynjolfsson, Paul Hofmann, and John Jordan.
Cloud computing and electricity: Beyond the utility model.
Communications of the ACM, 53(5):32-34, May 2010.
[ bib |
.pdf ]
Discussion of technical and business strengths and weaknesses of
the utility computing model, including security, lock-in and interoperability.
|
|
[durk10]
|
Dave Durkee.
Why cloud computing will never be free.
Communications of the ACM, 53(5):62-69, May 2010.
[ bib ]
Discusses cloud service pricing, the cloud computing marketplace,
and strategies that may be used by vendors to keep costs low, and
weaknesses of current cloud SLAs. Then discusses requirements for
Cloud 2.0, meaning cloud services intended to support critical enterprise
applications. Issues include storage system performance - argues
that access randomness and working set size are proportional to the
number of applications supported by a shared storage service. Also
discusses administration, SLAs and automation.
|
|
[cami10]
|
Mustafa Canim, George A. Mihaila, Bishwaranjan Bhattacharjee, Kenneth A. Ross,
and Christian A. Lang.
SSD bufferpool extensions for database systems.
Proc. of the VLDB Endowment, 3(2):1435-1446, 2010.
[ bib |
.pdf |
.pdf ]
|
|
[daag10a]
|
Sudipto Das, Shashank Agarwal, Divyakant Agrawal, and Amr El Abbadi.
ElasTraS: An elastic, scalable, and self managing transactional
database for the cloud.
Technical Report 2010-04, University of California, Santa Barbara,
2010.
[ bib |
.pdf ]
|
|
[dani10]
|
Sudipto Das, Shoji Nishimura, Divyakant Agrawal, and Amr El Abbadi.
Live database migration for elasticity in a multitenant database for
cloud platforms.
Technical Report 2010-09, Department of Computer Science, University
of California Santa Barbara, 2010.
[ bib |
.pdf ]
|
|
[dese10]
|
Biplob Debnath, Sudipta Sengupta, and Jin Li.
FlashStore: High throughput persistent key-value store.
Proc. of the VLDB Endowment, 3(2):1414-1425, 2010.
[ bib |
.pdf |
.pdf ]
Writes collected in RAM and batched to SSD in chunks large enough
to fill a flash page. Hash table in memory is used to index key,value
pairs in the SSD. There is also a read cache in RAM. Berkeley DB
is used to index key,value records on disk. Record read checks RAM
read cache, then RAM write buffer, then SSD, then disk. All reads
are added to the RAM read cache. Records are inserted into the SSD
when they are written (after staging). SSD pages are organized as
a ring buffer. When SSD fills, records on early pages are recycled
- either by reinserting them into the SSD or by destaging them to
the disk. A clock-like algorithm (with recent-reference bit) is used
to determine whether a record is reinserted into SSD or destaged
to disk.
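
The read path and the recycling decision described in this note can be sketched as follows. This is only an illustration of the annotation, not FlashStore's implementation: the class `TieredStore`, its fields, and the `ssd_capacity` parameter are hypothetical, plain Python containers stand in for the SSD log and the on-disk index, and the read cache is left unbounded for brevity.

```python
from collections import deque


class TieredStore:
    """Toy model of the tiers: RAM read cache, RAM write buffer, SSD ring, disk."""

    def __init__(self, ssd_capacity):
        self.read_cache = {}      # RAM read cache (unbounded in this sketch)
        self.write_buffer = {}    # RAM staging area for writes
        self.ssd = deque()        # ring of (seq, key, value), oldest on the left
        self.ssd_index = {}       # in-memory hash index: key -> (seq, value)
        self.referenced = set()   # keys whose recent-reference bit is set
        self.disk = {}            # stands in for the on-disk index
        self.ssd_capacity = ssd_capacity
        self.seq = 0

    def put(self, key, value):
        self.read_cache.pop(key, None)   # drop any stale cached copy
        self.write_buffer[key] = value

    def flush(self):
        """Move staged writes to the SSD, recycling old records when it is full."""
        for key, value in self.write_buffer.items():
            while len(self.ssd) >= self.ssd_capacity:
                self._recycle_oldest()
            self._append(key, value)
        self.write_buffer.clear()

    def get(self, key):
        """Check read cache, then write buffer, then SSD, then disk."""
        if key in self.read_cache:
            return self.read_cache[key]
        if key in self.write_buffer:
            value = self.write_buffer[key]
        elif key in self.ssd_index:
            value = self.ssd_index[key][1]
            self.referenced.add(key)     # set the recent-reference bit
        elif key in self.disk:
            value = self.disk[key]
        else:
            return None
        self.read_cache[key] = value     # every successful read is cached
        return value

    def _append(self, key, value):
        self.seq += 1
        self.ssd.append((self.seq, key, value))
        self.ssd_index[key] = (self.seq, value)

    def _recycle_oldest(self):
        seq, key, value = self.ssd.popleft()
        if self.ssd_index.get(key, (None, None))[0] != seq:
            return                       # superseded by a newer version
        if key in self.referenced:
            self.referenced.discard(key) # second chance: reinsert at the tail
            self._append(key, value)
        else:
            del self.ssd_index[key]      # cold record: destage to disk
            self.disk[key] = value
```

Clearing the reference bit on reinsertion guarantees that a full pass over the ring eventually destages something, which is the usual second-chance argument for clock-style policies.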
|
|
[guku10]
|
Ajay Gulati, Chethan Kumar, Irfan Ahmad, and Karan Kumar.
BASIL: Automated IO load balancing across storage devices.
In USENIX Conference on File and Storage Technology (FAST'10),
2010.
[ bib |
.pdf ]
|
|
[jobo10]
|
William K. Josephson, Lars A. Bongo, David Flynn, and Kai Li.
DFS: A file system for virtualized flash storage.
In USENIX Conference on File and Storage Technology (FAST'10),
2010.
[ bib |
.pdf ]
|
|
[leig10]
|
Tom Leighton.
Akamai and cloud computing: A perspective from the edge of the cloud.
Akamai white paper, 2010.
[ bib |
.pdf ]
|
|
[peda10]
|
Daniel Peng and Frank Dabek.
Large-scale incremental processing using distributed transactions and
notifications.
In USENIX Conference on Operating Systems Design and
Implementation, pages 1-15, 2010.
[ bib |
.pdf |
.pdf ]
Describes Percolator, used to incrementally maintain Google's web
search index. Provides multi-row transactions and snapshot isolation,
using multi-versioning in BigTable. Some transactions may have high
latency.
|
|
[lama09]
|
Avinash Lakshman and Prashant Malik.
Cassandra - a decentralized structured storage system.
In Proc. ACM SIGOPS Int'l Workshop on Large Scale Distributed
Systems and Middleware (LADIS'09), October 2009.
[ bib |
.pdf |
.pdf ]
|
|
[pure09]
|
Transparent application scaling with IBM DB2 pureScale.
IBM white paper, October 2009.
[ bib |
.pdf |
.pdf ]
|
|
[wure09]
|
Xiaojian Wu and A. L. Narasimha Reddy.
Managing storage space in a flash and disk hybrid storage system.
In Proc. IEEE/ACM Int'l Symp. on Modelling, Analysis and
Simulation of Computer and Telecommunication Systems (MASCOTS), September
2009.
[ bib |
.pdf |
.pdf ]
|
|
[coco09]
|
Greenplum.
MAD skills: New analysis practices for big data.
Greenplum white paper, March 2009.
[ bib |
.pdf |
.pdf ]
|
|
[abba09]
|
Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Alexander Rasin, and
Avi Silberschatz.
HadoopDB: An architectural hybrid of MapReduce and DBMS technologies
for analytical workloads.
Proc. of the VLDB Endowment, 2(1):922-933, 2009.
[ bib |
.pdf |
.pdf ]
|
|
[auja09]
|
Stefan Aulbach, Dean Jacobs, Alfons Kemper, and Michael Seibold.
A comparison of flexible schemas for software as a service.
In Proc. ACM SIGMOD Int'l Conference on Management of Data,
pages 881-888, 2009.
[ bib |
DOI |
.pdf ]
|
|
[cabh09]
|
Mustafa Canim, Bishwaranjan Bhattacharjee, George Mihaila, Christian Lang, and
Ken Ross.
An object placement advisor for DB2 using solid state storage.
Proc. of the VLDB Endowment, 2(2):1318-1329, 2009.
[ bib |
.pdf |
.pdf ]
|
|
[daag09]
|
Sudipto Das, Divyakant Agrawal, and Amr El Abbadi.
ElasTraS: An elastic transactional data store in the cloud.
In Proc. USENIX Workshop on Hot Topics in Cloud Computing,
2009.
[ bib |
.pdf |
.pdf ]
|
|
[frpa09]
|
Eric Friedman, Peter M. Pawlowski, and John Cieslewicz.
SQL/MapReduce: A practical approach to self-describing, polymorphic,
and parallelizable user-defined functions.
Proc. of the VLDB Endowment, 2(2):1402-1413, 2009.
[ bib |
.pdf |
.pdf ]
|
|
[gare09]
|
John Garrison and A. L. Narasimha Reddy.
Umbrella file system: Storage management across heterogeneous
devices.
ACM Transactions on Storage, 5(1), 2009.
[ bib |
DOI |
.pdf |
.pdf ]
|
|
[gana09]
|
Alan Gates, Olga Natkovich, Shubham Chopra, Pradeep Kamath, Shravan Narayanam,
Christopher Olston, Benjamin Reed, Santhosh Srinivasan, and Utkarsh
Srivastava.
Building a high-level dataflow system on top of MapReduce: The Pig
experience.
Proc. of the VLDB Endowment, 2(2):1414-1425, 2009.
[ bib |
.pdf |
.pdf ]
|
|
[isyu09]
|
Michael Isard and Yuan Yu.
Distributed data-parallel computing using a high-level programming
language.
In Proc. ACM SIGMOD Int'l Conf. on Management of Data
(SIGMOD'09), pages 987-994, 2009.
[ bib |
DOI |
.pdf ]
|
|
[papa09]
|
Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt,
Samuel Madden, and Michael Stonebraker.
A comparison of approaches to large-scale data analysis.
In Proc. ACM SIGMOD Int'l Conf. on Management of Data
(SIGMOD'09), pages 165-178, 2009.
[ bib |
DOI |
.pdf ]
|
|
[thsa09]
|
Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh
Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy.
Hive - a warehousing solution over a map-reduce framework.
Proc. of the VLDB Endowment, 2(2):1626-1629, 2009.
[ bib |
.pdf |
.pdf ]
|
|
[webo09]
|
Craig D. Weissman and Steve Bobrowski.
The design of the Force.com multitenant internet application
development platform.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD), pages 889-896, 2009.
[ bib |
DOI |
.pdf ]
|
|
[coha08]
|
Graham Cormode and Marios Hadjieleftheriou.
Finding frequent items in data streams.
In Proc. Int'l Conference on Very Large Data Bases (VLDB'08),
August 2008.
[ bib |
.pdf ]
|
|
[selt08]
|
Margo Seltzer.
Beyond relational databases.
Communications of the ACM, 51(7):52-58, July 2008.
[ bib |
DOI |
.pdf ]
Argues for modular and configurable DBMS to address new applications:
warehousing, directory services, web search, mobile device caching,
XML, streams.
|
|
[prit08]
|
Dan Pritchett.
BASE: An ACID alternative.
ACM Queue, 6(3):48-55, May 2008.
[ bib |
DOI |
.pdf ]
|
|
[abma08]
|
Daniel J. Abadi, Samuel Madden, and Nabil Hachem.
Column-stores vs. row-stores: How different are they really?
In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pages
967-980, 2008.
[ bib |
.pdf ]
|
|
[aggo08]
|
Marcos K. Aguilera, Wojciech M. Golab, and Mehul A. Shah.
A practical scalable distributed B-tree.
Proc. of the VLDB Endowment, 1(1):598-609, 2008.
[ bib |
.pdf |
.pdf ]
|
|
[augr08]
|
Stefan Aulbach, Torsten Grust, Dean Jacobs, Alfons Kemper, and Jan Rittinger.
Multi-tenant databases for software as a service: schema-mapping
techniques.
In Proc. ACM SIGMOD Int'l Conference on Management of Data,
pages 1195-1206, 2008.
[ bib |
DOI |
.pdf ]
|
|
[brfl08]
|
Matthias Brantner, Daniela Florescu, David Graf, Donald Kossmann, and Tim
Kraska.
Building a database on S3.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD), pages 251-264, 2008.
[ bib |
DOI |
.pdf ]
|
|
[caro08]
|
Michael J. Cahill, Uwe Röhm, and Alan D. Fekete.
Serializable isolation for snapshot databases.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD), pages 729-738, 2008.
[ bib |
DOI |
.pdf ]
|
|
[depa08]
|
David J. DeWitt, Erik Paulson, Eric Robinson, Jeffrey F. Naughton, Joshua
Royalty, Srinath Shankar, and Andrew Krioukov.
Clustera: an integrated computation and data management system.
Proc. of the VLDB Endowment, 1(1):28-41, 2008.
[ bib |
.pdf |
.pdf ]
|
|
[kovi08]
|
Ioannis Koltsidas and Stratis D. Viglas.
Flashing up the storage layer.
Proc. of the VLDB Endowment, 1(1):514-525, 2008.
[ bib |
DOI |
.pdf |
.pdf ]
Considers architecture with both flash and magnetic disk available
for persistent storage. Each block lives persistently either on disk
or on flash, not both. Assumes there is a demand-paged in-memory
block cache that makes a placement decision on eviction of a dirty
page. Proposed placement algorithms count page reads and writes and use
the counts, as well as the costs of read and write operations on
disk and flash, to decide where to place an evicted page. Placement
decisions are made independently for each page. In particular, there
are no capacity constraints and thus the algorithms may choose to
place all blocks on the same device. Proposed cache replacement algorithm
keeps some number of least-recently-used pages in four queues corresponding
to whether the page is clean or dirty and whether the page is located
on flash or disk. Always evict the page with the lowest eviction
cost from among these least-recently-used pages.
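
One way to read the per-page placement rule in this note is as a comparison of expected access costs on each device. The sketch below is that reading only, not the paper's algorithms: the `Device` type, the function names, and the cost figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    read_cost: float    # cost of one page read, arbitrary units
    write_cost: float   # cost of one page write, arbitrary units


def expected_cost(device, reads, writes):
    """Cost of replaying the page's observed accesses on `device`."""
    return reads * device.read_cost + writes * device.write_cost


def place_page(reads, writes, devices):
    """Pick the device with the lowest expected cost for this one page."""
    return min(devices, key=lambda d: expected_cost(d, reads, writes))


# Hypothetical costs: flash reads are cheap, flash writes are expensive.
FLASH = Device("flash", read_cost=1.0, write_cost=10.0)
DISK = Device("disk", read_cost=5.0, write_cost=5.0)
```

Because `place_page` looks at one page at a time and ignores capacity, a workload in which every page is read-heavy would send every page to flash, which matches the caveat in the note.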
|
|
[kesh07]
|
S. Keshav.
How to read a paper.
ACM SIGCOMM Computer Communication Review, 37(3):83-84, July
2007.
[ bib |
http |
.pdf ]
|
|
[laju07]
|
Pepijn de Langen and Ben H. H. Juurlink.
Trade-offs between voltage scaling and processor shutdown for
low-energy embedded multiprocessors.
In Int'l Workshop on Embedded Computer Systems: Architectures,
Modeling, and Simulation, number 4599 in Lecture Notes in Computer Science.
Springer-Verlag, July 2007.
[ bib |
.pdf ]
|
|
[stke07]
|
Christopher Stewart, Terence Kelly, and Alex Zhang.
Exploiting nonstationarity for performance prediction.
In Proc. EuroSys 2007, pages 31-46, March 2007.
[ bib |
.pdf ]
|
|
[agme07]
|
Marcos K. Aguilera, Arif Merchant, Mehul Shah, Alistair Veitch, and Christos
Karamanolis.
Sinfonia: a new paradigm for building scalable distributed systems.
In Proc. ACM SIGOPS Symposium on Operating Systems Principles
(SOSP), pages 159-174, 2007.
[ bib |
DOI |
.pdf |
.pdf ]
|
|
[grae07]
|
Goetz Graefe.
The five-minute rule twenty years later, and how flash memory changes
the rules.
In Proc. Int'l Workshop on Data Management on New Hardware,
pages 1-9, 2007.
[ bib |
DOI |
.pdf ]
|
|
[hest07]
|
Joseph Hellerstein, Michael Stonebraker, and James Hamilton.
Architecture of a database system.
Foundations and Trends in Databases, 1(2):141-259, 2007.
[ bib |
.pdf |
.pdf ]
|
|
[orac07]
|
Oracle.
Scalability and performance with Oracle 11g database.
Oracle white paper, 2007.
[ bib |
.pdf ]
|
|
[beda06]
|
Philip A. Bernstein, Nishant Dani, Badriddine Khessib, Ramesh Manne, and David
Shutt.
Data management issues in supporting large-scale web services.
Bulletin of the IEEE Technical Committee on Data Engineering,
29(4):3-9, December 2006.
[ bib |
.ps |
.ps ]
|
|
[rale06]
|
Parthasarathy Ranganathan, Phil Leech, David E. Irwin, and Jeffrey S. Chase.
Ensemble-level power management for dense blade servers.
In Proc. International Symposium on Computer Architecture
(ISCA'06), pages 66-77, June 2006.
[ bib |
.pdf |
.pdf ]
Power management for groups (ensembles) of servers, under the assumption
that the servers in a group are likely to require peak power at different
times. Goal is to reduce the amount of power overprovisioning required
for the group.
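
The provisioning argument in this note is arithmetic: if servers peak at different times, the peak of the group's total draw is smaller than the sum of the individual peaks. A small sketch with made-up power traces (the numbers and function names are illustrative, not from the paper):

```python
def sum_of_peaks(traces):
    """Power needed if every server is provisioned for its own peak."""
    return sum(max(trace) for trace in traces)


def peak_of_sum(traces):
    """Power needed if the group is provisioned for its combined peak."""
    return max(sum(samples) for samples in zip(*traces))


# Hypothetical power draw in watts for three servers over three intervals.
TRACES = [
    [100, 300, 100],
    [300, 100, 100],
    [100, 100, 300],
]
```

For these traces the per-server budget is 900 W while the group never draws more than 500 W at once.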
|
|
[arba06]
|
Arvind Arasu, Shivnath Babu, and Jennifer Widom.
The CQL continuous query language: Semantic foundations and query
execution.
VLDB Journal, 15:121-142, February 2006.
[ bib ]
CQL is the query language implemented by the Stanford STREAM database
system.
|
|
[burr06]
|
Michael Burrows.
The chubby lock service for loosely-coupled distributed systems.
In Proc. of the Symp. on Operating System Design and
Implementation (OSDI'06), pages 335-350, 2006.
[ bib |
.pdf |
.pdf ]
|
|
[crwu06]
|
Sailesh Krishnamurthy, Chung Wu, and Michael Franklin.
On-the-fly sharing for streamed aggregation.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'06), pages 623-634, 2006.
[ bib |
DOI |
.pdf ]
|
|
[lova06]
|
David Lomet, Zografoula Vagena, and Roger Barga.
Recovery from "bad" user transactions.
In Proc. ACM SIGMOD Int'l Conference on Management of Data
(SIGMOD'06), pages 337-346, 2006.
[ bib |
http |
.pdf ]
|
|
[nive06]
|
Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn.
Rethink the sync.
In USENIX Symposium on Operating Systems Design and
Implementation (OSDI'06), 2006.
[ bib |
.pdf |
.pdf ]
|
|
[paju06]
|
Seon-yeong Park, Dawoon Jung, Jeong-uk Kang, Jin-soo Kim, and Joonwon Lee.
CFLRU: A replacement algorithm for flash memory.
In Proc. Int'l Conf. on Compilers, Architecture and Synthesis
for Embedded Systems, pages 234-241, 2006.
[ bib |
DOI |
.pdf ]
|
|
[wudi06]
|
Eugene Wu, Yanlei Diao, and Shariq Rizvi.
High-performance complex event processing over streams.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'06), pages 407-418, 2006.
[ bib |
DOI |
.pdf ]
|
|
[xigo05]
|
Man Xiong, Brian Goldstein, and Chris Auger.
Scaling out SQL Server with data-dependent routing.
Dell Power Solutions, August 2005.
[ bib |
.pdf |
.pdf ]
|
|
[waro05]
|
Andrew Warfield, Russ Ross, Keir Fraser, Christian Limpach, and Steven Hand.
Parallax: managing storage for a million machines.
In Proc. USENIX Hot Topics in Operating Systems (HOTOS'05),
June 2005.
[ bib |
.pdf |
.pdf ]
Block level storage virtualization targeted at virtual machines.
Uses copy-on-write and trie-based block indexing to support versioned
device images. Virtualization is implemented in dedicated virtual
machines, one for each node in a cluster.
|
|
[moch05]
|
Justin D. Moore, Jeffrey S. Chase, Parthasarathy Ranganathan, and Ratnesh K.
Sharma.
Making scheduling "cool": Temperature-aware workload placement in
data centers.
In Proc. USENIX Annual Technical Conference, pages 61-75,
April 2005.
[ bib |
.pdf |
.pdf ]
|
|
[hedi05]
|
Taliver Heath, Bruno Diniz, Enrique V. Carrera, Wagner Meira Jr., and Ricardo
Bianchini.
Energy conservation in heterogeneous server clusters.
In Proc. ACM SIGPLAN Symposium on Principles and Practice of
Parallel Programming (PPoPP'05), pages 186-195, 2005.
[ bib |
DOI |
.pdf ]
How to distribute work in a cluster given that different nodes may
have different performance and power characteristics. Objective is
to minimize power consumption per unit of throughput. Test implementation
is in a cluster web server, and control is achieved by re-distributing
the workload among the cluster nodes. Two distribution mechanisms
are used: a simple front-end load balancer, and a peer-to-peer mechanism
for redistributing requests among servers.
|
|
[meag05]
|
Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi.
Efficient computation of frequent and top-k elements in data streams.
In Proc. International Conference on Database Theory (ICDT),
January 2005.
[ bib |
.pdf |
.pdf ]
|
|
[stab05]
|
Michael Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch
Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Samuel Madden,
Elizabeth J. O'Neil, Patrick E. O'Neil, Alex Rasin, Nga Tran, and Stanley B.
Zdonik.
C-Store: A column-oriented DBMS.
In Proc. Int'l Conf. on Very Large Data Bases, pages 553-564,
2005.
[ bib |
.pdf |
.pdf ]
|
|
[zhha05]
|
Ning Zhang, Peter J. Haas, Vanja Josifovski, Guy M. Lohman, and Chun Zhang.
Statistical learning techniques for costing XML queries.
In Proc. International Conference on Very Large Data Bases
(VLDB'05), pages 289-300, 2005.
[ bib |
.pdf |
.pdf ]
|
|
[zhko05]
|
Rui Zhang, Nick Koudas, Beng Chin Ooi, and Divesh Srivastava.
Multiple aggregations over data streams.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'05), pages 299-310, 2005.
[ bib |
DOI |
.pdf ]
|
|
[bhtr04]
|
Suparna Bhattacharya, John Tran, Mike Sullivan, and Chris Mason.
Linux AIO performance and robustness for enterprise workloads.
In Linux Symposium, pages 63-78, 2004.
[ bib |
.pdf ]
|
|
[dihe04]
|
Yixin Diao, Joseph L. Hellerstein, Adam J. Storm, Maheswaran Surendra, Sam
Lightstone, Sujay S. Parekh, and Christian Garcia-Arellano.
Incorporating cost of control into the design of a load balancing
controller.
In IEEE Real-Time and Embedded Technology and Applications
Symposium, 2004.
[ bib |
.pdf ]
|
|
[lech04]
|
Byung Suk Lee, Li Chen, Jeff Buzas, and Vinod Kannoth.
Regression-based self-tuning modeling of smooth user-defined function
costs for an object-relational database management system query optimizer.
The Computer Journal, 47(6):673-693, 2004.
[ bib |
.pdf ]
Builds a cost model by tracking costs of recent UDF invocations,
including their costs and values of cost-related parameters, and
then fitting a model to these data. Includes discussion of statistical
issues like collinearity and removal of outliers.
|
|
[mamu04]
|
John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong
Zhou.
Boxwood: abstractions as the foundation for storage infrastructure.
In Proc. of the Symp. on Operating System Design and
Implementation (OSDI'04), 2004.
[ bib |
.pdf ]
|
|
[razh04]
|
Amira Rahal, Qiang Zhu, and Per-Ake Larson.
Evolutionary techniques for updating query cost models in a dynamic
multidatabase environment.
VLDB Journal, 13(2):162-176, 2004.
[ bib |
.pdf ]
Considers cost models as linear functions of a set of explanatory
variables for each query class. Initial model is constructed by regression
over an initial set of labeled cost samples. Proposes two methods
to incrementally maintain such models by folding in new samples and
removing the effects of old samples. Assumes that queries from the
application workload are labeled and used to train the model.
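
Folding in new samples and removing old ones can be illustrated with running sums for a least-squares fit. The sketch below is a generic sliding regression, not either of the two methods the paper proposes; it assumes a single explanatory variable, and the class name `SlidingLinearModel` is hypothetical.

```python
class SlidingLinearModel:
    """Least-squares fit of cost = intercept + slope * x over a changing sample set."""

    def __init__(self):
        self.n = 0
        self.sx = 0.0    # sum of x
        self.sy = 0.0    # sum of y
        self.sxx = 0.0   # sum of x * x
        self.sxy = 0.0   # sum of x * y

    def _update(self, x, y, sign):
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def add(self, x, y):
        """Fold a new labeled cost sample into the model."""
        self._update(x, y, +1)

    def remove(self, x, y):
        """Remove the effect of a previously added sample."""
        self._update(x, y, -1)

    def coefficients(self):
        """Return (intercept, slope) for the samples currently in the model."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two samples with distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope
```

Keeping only the sums means each update costs constant time and no old samples need to be rescanned, at the price of having to remember each sample until it is removed.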
|
|
[akam04]
|
A developer's guide to on-demand distributed computing.
Akamai white paper, 2004.
[ bib |
.pdf ]
|
|
[pobe03]
|
Rachel Pottinger and Philip A. Bernstein.
Merging models based on given correspondences.
In Proceedings of the 29th International Conference on Very
Large Data Bases, pages 862-873, September 2003.
[ bib |
.pdf |
.pdf ]
|
|
[arha03]
|
Walid G. Aref, Moustafa A. Hammad, Ann Christine Catlin, Ihab F. Ilyas,
Thanaa M. Ghanem, Ahmed K. Elmagarmid, and Mirette S. Marzouk.
Video query processing in the VDBMS testbed for video database
research.
In ACM International Workshop on Multimedia Databases
(MMDB'03), pages 25-32, 2003.
[ bib |
DOI |
.pdf ]
|
|
[crjo03]
|
Charles D. Cranor, Theodore Johnson, Oliver Spatscheck, and Vladislav
Shkapenyuk.
Gigascope: A stream database for network applications.
In Proc. ACM SIGMOD International Conference on Management of
Data (SIGMOD'03), pages 647-651, 2003.
[ bib |
.pdf ]
|
|
[gada03]
|
Lei Gao, Mike Dahlin, Amol Nayate, Jiandan Zheng, and Arun Iyengar.
Application specific data replication for edge services.
In Proc. Int'l Conf. on World Wide Web (WWW'03), pages
449-460, 2003.
[ bib |
DOI |
.pdf ]
|
|
[ghgo03]
|
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung.
The Google file system.
In Proc. Symposium on Operating System Principles (SOSP'03),
pages 29-43, 2003.
[ bib |
.pdf ]
Discusses file system optimized for relatively small number of large
files. Workload is large sequential reads and appends.
Node failures are normal. Throughput is more important than latency.
Architecture has a single master and many chunk servers.
Master stores metadata (namespace, access controls). Chunks are replicated.
Local storage on each chunk node is via a Linux file system.
Clients do metadata operations through master, then go directly to
chunk servers for data retrieval.
Implements a weak consistency model. Metadata operations are atomic
and serialized. Concurrent writes of the same file
range may get mixed, not serialized. Concurrent appends may lead to
duplication.
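
The split between metadata and data paths in this note can be sketched with a toy model. Everything here is hypothetical (the class names, the tiny `CHUNK_SIZE`, the replica choice); it only illustrates a client asking the master where a chunk lives and then reading the bytes from a chunk server.

```python
CHUNK_SIZE = 4  # bytes per chunk in this toy model; illustrative only


class ChunkServer:
    """Stores chunk contents keyed by chunk handle."""

    def __init__(self):
        self.chunks = {}

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]


class Master:
    """Holds metadata only: file name -> list of (chunk handle, replicas)."""

    def __init__(self):
        self.files = {}

    def lookup(self, name, chunk_index):
        return self.files[name][chunk_index]


def client_read(master, name, offset, length):
    """Read a byte range that lies within a single chunk."""
    chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
    handle, replicas = master.lookup(name, chunk_index)      # metadata via master
    return replicas[0].read(handle, chunk_offset, length)    # data via chunk server
```

The master never touches file contents in this sketch, which is the property that keeps a single master from becoming a data bottleneck.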
|
|
[guel03]
|
Isabelle Guyon and Andre Elisseeff.
An introduction to variable and feature selection.
Journal of Machine Learning Research, 3:1157-1182, 2003.
[ bib |
.pdf ]
|
|
[ilar03]
|
Ihab F. Ilyas, Walid G. Aref, and Ahmed K. Elmagarmid.
Supporting top-k join queries in relational databases.
In Proceedings of 29th International Conference on Very Large
Data Bases (VLDB'03), pages 754-765, 2003.
[ bib |
.pdf |
.pdf ]
Assumes joined tuples are ranked according to a monotone function
of tuple ranks of join inputs. Defines physical join operators that
can produce join results in rank order. Operator needs to queue up
join results until it can be certain that it will produce them in
the proper order.
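
The need to queue results until their order is certain can be illustrated with a small rank-join sketch. It is written in the spirit of the note rather than taken from the paper: it assumes both inputs are lists of `(key, score)` sorted by descending score, uses the sum of scores as the monotone ranking function, and the name `rank_join` is hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import count

INF = float("inf")


def rank_join(left, right):
    """Yield (key, combined_score) for matching keys, highest score first."""
    inputs = (left, right)
    pos = [0, 0]            # next unread position in each input
    top = [INF, INF]        # score of the first tuple read from each input
    last = [INF, INF]       # score of the most recent tuple read from each input
    seen = (defaultdict(list), defaultdict(list))   # key -> scores read so far
    heap = []               # buffered join results, best first
    tiebreak = count()

    def threshold():
        # Upper bound on the score of any join result not yet generated:
        # it must use an unread tuple from one side and any tuple from the other.
        bound = -INF
        for side in (0, 1):
            if pos[side] < len(inputs[side]):
                bound = max(bound, last[side] + top[1 - side])
        return bound

    side = 0
    while pos[0] < len(left) or pos[1] < len(right):
        if pos[side] >= len(inputs[side]):
            side = 1 - side                 # this input is exhausted
        key, score = inputs[side][pos[side]]
        pos[side] += 1
        if pos[side] == 1:
            top[side] = score
        last[side] = score
        seen[side][key].append(score)
        for other in seen[1 - side].get(key, ()):
            heapq.heappush(heap, (-(score + other), next(tiebreak),
                                  (key, score + other)))
        bound = threshold()
        while heap and -heap[0][0] >= bound:
            yield heapq.heappop(heap)[2]    # safe: nothing better can appear
        side = 1 - side                     # alternate between the inputs
```

Results are released as soon as their score reaches the threshold, so a consumer that wants only the top few results can stop before either input has been read in full.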
|
|
[shba03]
|
Ratnesh K. Sharma, Cullen E. Bash, Chandrakant D. Patel, Richard J. Friedrich,
and Jeffrey S. Chase.
Balance of power: Dynamic thermal management for internet data
centers.
Technical Report HPL-2003-5, HP Laboratories, Palo Alto, California,
2003.
[ bib |
.pdf |
.pdf ]
Describes a methodology for thermal load balancing in server rooms.
Thermal imbalances can be caused by imbalanced distribution of server
workload and by peculiarities of the airflow in the server room,
e.g., racks at the end of a row may be hotter than racks in the middle.
Input includes server exhaust temperature readings and cold air temperature.
Local thermal imbalances can be corrected by adjusting the allocation
of work to the various servers.
|
|
[brko02]
|
Nicolas Bruno, Nick Koudas, and Divesh Srivastava.
Holistic twig joins: optimal XML pattern matching.
In Proceedings of the 2002 ACM SIGMOD International Conference
on Management of Data, pages 310-321, 2002.
[ bib |
.pdf ]
A technique for finding twig query matches without first matching
individual binary subrelationships in the twig, i.e., this is an
N-way structural join. Description of related work is a succinct
classification of previous work on twig query processing.
|
|
[gily02]
|
Seth Gilbert and Nancy Lynch.
Brewer's conjecture and the feasibility of consistent, available,
partition-tolerant web services.
SIGACT News, 33(2):51-59, 2002.
[ bib |
DOI |
.pdf ]
|
|
[mena02]
|
Daniel A. Menascé.
TPC-W: A benchmark for E-Commerce.
IEEE Internet Computing, 6(3):83-87, 2002.
[ bib |
.pdf ]
|
|
[pibi01]
|
Eduardo Pinheiro, Ricardo Bianchini, Enrique Carrera, and Taliver Heath.
Load balancing and unbalancing for power and performance in
cluster-based systems.
In Proc. Workshop on Compilers and Operating Systems for Low
Power, September 2001.
[ bib |
.ps.gz |
.ps.gz ]
Automatic power management in a cluster of servers by concentrating
load on as few machines as possible and turning others off. Implemented
in a web server and in a cluster operating system.
|
|
[boco01]
|
P. Bohrer, D. Cohn, E.N. Elnozahy, T. Keller, M. Kistler, C. Lefurgy,
R. Rajamony, F. Rawson, and E. V. Hensbergen.
Energy conservation for servers.
In Proc. IEEE Workshop on Power Management for Real-Time and
Embedded Systems, May 2001.
[ bib |
.pdf |
.pdf ]
A brief general overview of the problem of energy conservation
in data centers.
|
|
[horn01]
|
Paul Horn.
Autonomic computing: IBM's perspective on the state of information
technology.
Technical report, International Business Machines Corporation,
Armonk, NY, USA, 2001.
[ bib |
.pdf ]
|
|
[poha01]
|
Rachel Pottinger and Alon Y. Halevy.
MiniCon: A scalable algorithm for answering queries using views.
VLDB Journal, 10(2-3):182-198, 2001.
[ bib |
.pdf |
.pdf ]
|
|
[rodr01]
|
Antony I. T. Rowstron and Peter Druschel.
Pastry: Scalable, decentralized object location, and routing for
large-scale peer-to-peer systems.
In Proc. IFIP/ACM International Conference on Distributed
Systems Platforms (Middleware'01), pages 329-350, 2001.
[ bib |
.pdf ]
|
|
[stmo01]
|
Ion Stoica, Robert Morris, David R. Karger, M. Frans Kaashoek, and Hari
Balakrishnan.
Chord: A scalable peer-to-peer lookup service for internet
applications.
In Proc. ACM SIGCOMM Conference, pages 149-160, 2001.
[ bib |
DOI |
.pdf |
.pdf ]
|
|
[brew00]
|
Eric A. Brewer.
Towards robust distributed systems.
Keynote presentation, ACM Symposium on Principles of Distributed
Computing (PODC), July 2000.
[ bib |
.pdf |
.pdf ]
Presentation of the CAP conjecture.
|
|
[beha00]
|
Philip A. Bernstein, Alon Y. Halevy, and Rachel Pottinger.
A vision of management of complex models.
SIGMOD Record, 29(4):55-63, 2000.
[ bib |
.pdf |
.pdf ]
|
|
[grbr00]
|
Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David E. Culler.
Scalable, distributed data structures for internet service
construction.
In Proc. of the Symp. on Operating System Design and
Implementation (OSDI'00), pages 319-332, 2000.
[ bib |
.pdf ]
|
|
[yuva00]
|
Haifeng Yu and Amin Vahdat.
Design and evaluation of a continuous consistency model for
replicated services.
In Proc. of the Symp. on Operating System Design and
Implementation (OSDI'00), pages 21-21, 2000.
[ bib |
.pdf ]
|
|
[grgr97]
|
Jim Gray and Goetz Graefe.
The five-minute rule ten years later, and other computer storage
rules of thumb.
SIGMOD Record, 26(4):63-68, December 1997.
[ bib |
DOI ]
|
|
[pesp97]
|
Karin Petersen, Mike J. Spreitzer, Douglas B. Terry, Marvin M. Theimer, and
Alan J. Demers.
Flexible update propagation for weakly consistent replication.
In Proc. of the ACM Symp. on Operating Systems Principles
(SOSP'97), pages 288-301, 1997.
[ bib |
DOI |
.pdf ]
|
|
[degr92]
|
David J. DeWitt and Jim Gray.
Parallel database systems: The future of high-performance database
systems.
Communications of the ACM, 35(6):85-98, 1992.
[ bib |
.pdf ]
Discusses scale-up and speed-up as two distinct parallelism objectives.
Discusses shared-memory, shared-disk, and shared-nothing architectures
and argues that the latter will provide the best scalability because
it minimizes interaction and so places the least demands on the
interconnection network. Discusses data partitioning and parallelization
of relational query operators.
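
The two objectives named in this note are commonly written as ratios of elapsed times. The sketch below uses those standard definitions; the function and parameter names are illustrative.

```python
def speedup(small_system_time, big_system_time):
    """Same problem on a larger system: how much faster does it run?"""
    return small_system_time / big_system_time


def scaleup(small_system_small_problem_time, big_system_big_problem_time):
    """Proportionally larger problem on a larger system: is elapsed time held?"""
    return small_system_small_problem_time / big_system_big_problem_time
```

With these definitions, a system that is N times larger shows linear speedup when `speedup` equals N, and linear scaleup when `scaleup` stays at 1.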
|
|
[grae90]
|
Goetz Graefe.
Encapsulation of parallelism in the Volcano query processing system.
In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pages
102-111, 1990.
[ bib |
DOI |
.pdf ]
|
|
[grpu86]
|
Jim Gray and Franco Putzolu.
The 5 minute rule for trading memory for disk accesses and the 5 byte
rule for trading memory for CPU time.
Technical Report 86.1, Tandem Computers, May 1986.
Original report was May 1985.
[ bib |
.pdf |
.pdf ]
|
|
[lamp78]
|
Leslie Lamport.
Time, clocks and the ordering of events in a distributed system.
Communications of the ACM, 21(7):558-565, July 1978.
[ bib |
.pdf ]
|
|
[mage70]
|
R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger.
Evaluation techniques for storage hierarchies.
IBM Systems Journal, 9(2):78-117, June 1970.
[ bib |
DOI |
.pdf ]
Includes a proof of optimality of the MIN algorithm.
|
|
[xero11]
|
Xeround.
Xeround cloud database, part 1 - technology.
Xeround white paper.
Downloaded March 2011.
[ bib |
.pdf |
.pdf ]
Multiple MySQL front ends, replicated partitioned data. Assignment
of partitions to nodes can be adjusted to support elastic scale-out.
Supports distributed query execution. Company offers database service,
rather than software.
|