Title: | Benchmark and Frontier Analysis Using DEA and SFA |
---|---|
Description: | Methods for frontier analysis, Data Envelopment Analysis (DEA), under different technology assumptions (fdh, vrs, drs, crs, irs, add/frh, and fdh+), and using different efficiency measures (input based, output based, hyperbolic graph, additive, super, and directional efficiency). Peers and slacks are available, partial price information can be included, and optimal cost, revenue and profit can be calculated. Evaluation of mergers is also supported. Methods for graphing the technology sets are also included. There is also support for comparative methods based on Stochastic Frontier Analyses (SFA) and for convex nonparametric least squares of convex functions (STONED). In general, the methods can be used to solve not only standard models, but also many other model variants. It complements the book, Bogetoft and Otto, Benchmarking with DEA, SFA, and R, Springer-Verlag, 2011, but can of course also be used as a stand-alone package. |
Authors: | Peter Bogetoft and Lars Otto |
Maintainer: | Lars Otto <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.32 |
Built: | 2025-02-12 05:41:26 UTC |
Source: | https://github.com/cran/Benchmarking |
The Benchmarking package contains methods to estimate technologies and measure efficiencies using DEA and SFA. Data Envelopment Analysis (DEA) are supported under different technology assumptions (fdh, vrs, drs, crs, irs, add), and using different efficiency measures (input based, output based, hyperbolic graph, additive, super, directional). Peers are available, partial price information can be included, and optimal cost, revenue and profit can be calculated. Evaluation of mergers are also supported. Comparative methods for estimating stochastic frontier function (SFA) efficiencies and for convex nonparametric least squares here for convex functions (StoNED) are also included. The methods can solve not only standard models, but also many other model variants, and they can be modified to solve new models.
The package also support simple plots of DEA technologies with two goods; either as a transformation curve (2 outputs), an isoquant (2 inputs), or a production function (1 input and 1 output). When more inputs and outputs are available they are aggregated using weights (prices, relative prices).
The package complements the book, Bogetoft and Otto, Benchmarking with DEA, SFA, and R, Springer-Verlag 2011, but can of course also be used as a stand-alone package.
Package: | Benchmarking |
Type: | Package |
Version: | 0.30 ($Revision: 233 $) |
Date: | $Date: 2020-08-10 18:43:17 +0200 (ma, 10 aug 2020) $ |
License: | Copyright |
dea |
DEA input or output efficience measures, peers, lambdas and slacks |
dea.dual |
Dual weights (prices), including restrictions on weights |
dea.direct |
Directional efficiency |
sdea |
Super efficiency. |
dea.add |
Additive efficiency; sum of slacks in DEA technology. |
mea |
Multidirectional efficiency analysis or potential improvements. |
eff |
Efficiency from an object returned from any of the dea or sfa functions. |
slack |
Slacks in DEA models |
excess |
Calculates excess input or output compared to DEA frontier. |
peers |
get the peers for each firm. |
dea.boot |
Bootstrap DEA models |
cost.opt |
Optimal input for given output and prices. |
revenue.opt |
Optimal output for given input and prices. |
profit.opt |
Optimal input and output for given input and output prices. |
dea.plot |
Graphs of DEA technologies under alternative technology assumptions. |
dea.plot.frontier |
Specialized for 1 input and 1 output. |
dea.plot.isoquant |
Specialized for 2 inputs. |
dea.plot.transform |
Specialized for 2 outputs. |
eladder |
Efficiency ladder for a single firm. |
eladder.plot |
Plot efficiency ladder for a single firm. |
make.merge |
Make an aggregation matrix to perform mergers. |
dea.merge |
Decompose efficiency from a merger of firms |
sfa |
Stochastic frontier analysis, production, distance, and cost function (SFA) |
stoned |
Convex nonparametric least squares here for convex function function |
outlierC.ap, outlier.ap |
Detection of outliers |
eff.dens |
Estimate and plot kernel density of efficiencies |
critValue |
Critical values calculated from bootstrap DEA models. |
typeIerror |
Probability of a type I error for a test in bootstrap DEA models. |
The interface for the methods are very much like the interface to the
methods in the package FEAR (Wilson 2008). One change is that
the data now are transposed to reflect how data is usually available
in applications, i.e. we have firms on rows, and inputs and output in
the columns. Also, the argument for the options RTS
and
ORIENTATION
can be given as memotechnical strings, and there
are more options to control output.
The input and output matrices can contain negative numbers, and the methods can thereby manage restricted or fixed input or output.
The return is not just the efficiency, but also slacks, dual values (shadow prices), peers, and lambdas (weights).
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R; Springer 2011
Paul W. Wilson (2008), “FEAR 1.0: A Software Package for Frontier Efficiency Analysis with R,” Socio-Economic Planning Sciences 42, 247–254
# Plot of different technologies x <- matrix(c(100,200,300,500),ncol=1,dimnames=list(LETTERS[1:4],"x")) y <- matrix(c(75,100,300,400),ncol=1,dimnames=list(LETTERS[1:4],"y")) dea.plot(x,y,RTS="vrs",ORIENTATION="in-out",txt=rownames(x)) dea.plot(x,y,RTS="drs",ORIENTATION="in-out",add=TRUE,lty="dashed",lwd=2) dea.plot(x,y,RTS="crs",ORIENTATION="in-out",add=TRUE,lty="dotted") dea.plot(x,y,RTS="fdh",ORIENTATION="in-out",txt=rownames(x),main="fdh") dea.plot(x,y,RTS="irs",ORIENTATION="in-out",txt=TRUE,main="irs") dea.plot(x,y,RTS="irs2",ORIENTATION="in-out",txt=rownames(x),main="irs2") dea.plot(x,y,RTS="add",ORIENTATION="in-out",txt=rownames(x),main="add") # A quick frontier with 1 input and 1 output dea.plot(x,y, main="Basic plot of frontier") # Calculating efficiency dea(x,y, RTS="vrs", ORIENTATION="in") e <- dea(x,y, RTS="vrs", ORIENTATION="in") e eff(e) peers(e) peers(e, NAMES=TRUE) print(peers(e, NAMES=TRUE), quote=FALSE) lambda(e) summary(e) # Calculating super efficiency esuper <- sdea(x,y, RTS="vrs", ORIENTATION="in") esuper print(peers(esuper,NAMES=TRUE),quote=FALSE) # Technology for super efficiency for firm number 3/C # Note that drop=FALSE is necessary for XREF and YREF to be matrices # when one of the dimensions is or is reduced to 1. e3 <- dea(x,y, XREF=x[-3,,drop=FALSE], YREF=y[-3,,drop=FALSE]) dea.plot(x[-3],y[-3],RTS="vrs",ORIENTATION="in-out",txt=LETTERS[c(1,2,4)]) points(x[3],y[3],cex=2) text(x[3],y[3],LETTERS[3],adj=c(-.75,.75)) e3 <- dea(x,y, XREF=x[-3,,drop=FALSE], YREF=y[-3,,drop=FALSE]) eff(e3) peers(e3) print(peers(e3,NAMES=TRUE),quote=FALSE) lambda(e3) e3$lambda # Taking care of slacks x <- matrix(c(100,200,300,500,100,600),ncol=1, dimnames=list(LETTERS[1:6],"x")) y <- matrix(c(75,100,300,400,50,400),ncol=1, dimnames=list(LETTERS[1:6],"y")) # Phase one, calculate efficiency e <- dea(x,y) print(e) peers(e) lambda(e) # Phase two, calculate slacks (maximize sum of slacks) sl <- slack(x,y,e) data.frame(sl$sx,sl$sy) peers(sl) lambda(sl) sl$lambda summary(sl) # The two phases in one function call e2 <- dea(x,y,SLACK=TRUE) print(e2) data.frame(eff(e2),e2$slack,e2$sx,e2$sy,lambda(e2)) peers(e2) lambda(e2) e2$lambda
# Plot of different technologies x <- matrix(c(100,200,300,500),ncol=1,dimnames=list(LETTERS[1:4],"x")) y <- matrix(c(75,100,300,400),ncol=1,dimnames=list(LETTERS[1:4],"y")) dea.plot(x,y,RTS="vrs",ORIENTATION="in-out",txt=rownames(x)) dea.plot(x,y,RTS="drs",ORIENTATION="in-out",add=TRUE,lty="dashed",lwd=2) dea.plot(x,y,RTS="crs",ORIENTATION="in-out",add=TRUE,lty="dotted") dea.plot(x,y,RTS="fdh",ORIENTATION="in-out",txt=rownames(x),main="fdh") dea.plot(x,y,RTS="irs",ORIENTATION="in-out",txt=TRUE,main="irs") dea.plot(x,y,RTS="irs2",ORIENTATION="in-out",txt=rownames(x),main="irs2") dea.plot(x,y,RTS="add",ORIENTATION="in-out",txt=rownames(x),main="add") # A quick frontier with 1 input and 1 output dea.plot(x,y, main="Basic plot of frontier") # Calculating efficiency dea(x,y, RTS="vrs", ORIENTATION="in") e <- dea(x,y, RTS="vrs", ORIENTATION="in") e eff(e) peers(e) peers(e, NAMES=TRUE) print(peers(e, NAMES=TRUE), quote=FALSE) lambda(e) summary(e) # Calculating super efficiency esuper <- sdea(x,y, RTS="vrs", ORIENTATION="in") esuper print(peers(esuper,NAMES=TRUE),quote=FALSE) # Technology for super efficiency for firm number 3/C # Note that drop=FALSE is necessary for XREF and YREF to be matrices # when one of the dimensions is or is reduced to 1. e3 <- dea(x,y, XREF=x[-3,,drop=FALSE], YREF=y[-3,,drop=FALSE]) dea.plot(x[-3],y[-3],RTS="vrs",ORIENTATION="in-out",txt=LETTERS[c(1,2,4)]) points(x[3],y[3],cex=2) text(x[3],y[3],LETTERS[3],adj=c(-.75,.75)) e3 <- dea(x,y, XREF=x[-3,,drop=FALSE], YREF=y[-3,,drop=FALSE]) eff(e3) peers(e3) print(peers(e3,NAMES=TRUE),quote=FALSE) lambda(e3) e3$lambda # Taking care of slacks x <- matrix(c(100,200,300,500,100,600),ncol=1, dimnames=list(LETTERS[1:6],"x")) y <- matrix(c(75,100,300,400,50,400),ncol=1, dimnames=list(LETTERS[1:6],"y")) # Phase one, calculate efficiency e <- dea(x,y) print(e) peers(e) lambda(e) # Phase two, calculate slacks (maximize sum of slacks) sl <- slack(x,y,e) data.frame(sl$sx,sl$sy) peers(sl) lambda(sl) sl$lambda summary(sl) # The two phases in one function call e2 <- dea(x,y,SLACK=TRUE) print(e2) data.frame(eff(e2),e2$slack,e2$sx,e2$sy,lambda(e2)) peers(e2) lambda(e2) e2$lambda
The data set is from an US federally sponsored program for providing
remedial assistance to disadvantaged primary school students. The
firms are 70 school sites, and data are from entire sites. The
variables consists of results from three different kind of tests, a
reading score, y1
, a math score, y2
, and a self–esteem
score, y3
, which are considered outputs in the model, and five
different variables considered to be inputs, the education level of
the mother, x1
, the highest occupation of a family member,
x2
, parental visits to school, x3
, time spent with
children in school-related topics, x4
, and the number of
teachers at the site, x5
.
data(charnes1981)
data(charnes1981)
A data frame with 70 school sites with the following variables.
firm
school site number
x1
education level of the mother
x2
highest occupation of a family member
x3
parental visits to school
x4
time spent with children in school-related topics
x5
the number of teachers at the site
y1
reading score
y2
math score
y3
self–esteem score
pft
=1 if in program (program follow through) and =0 if not in program
name
Site name
The command data(charnes1981)
will create a data
frame named charnes1981
with the above data.
Beside input and output varianles there is further information in the
data set, that the first 50 school sites followed the program and that
the last 20 are the results for sites not following the program. This
is showed by the variable pft
.
Data as .csv
are loaded by the command data
using
read.table(..., header=TRUE, sep=";")
such that this file
is a semicolon separated file and not a comma separated file.
Therefore, to read the file from a script the command must be
read.csv("charnes1981.csv", sep=";")
or
read.csv2("charnes1981.csv")
.
Thus the data can be read either as charnes1981 <-
read.csv2(paste(.Library, "Benchmarking/data",
"charnes1981.csv", sep ="/"))
or as data(charnes1981)
if
the package Benchmarking is loaded. In both cases the data will
be in the data frame charnes1981
.
Charnes, Cooper, and Rhodes, “Evaluating Program and Managerial Efficiency: An Application of Data Envelopment Analysis to Program Follow Through”, Management Science, volume 27, number 6, June 1981, pages 668–697.
data(charnes1981) x <- with(charnes1981, cbind(x1,x2,x3,x4,x5)) y <- with(charnes1981, cbind(y1,y2,y3)) # Farrell inpout efficiency; vrs technology e <- dea(x,y) # The number of times each peer is a peer np <- get.number.peers(e) # Peers that are peers for more than 20 schools, and the number of # times they are peers np[which(np[,2]>20),] # Plot first input against first output and emphasize the peers that # are peers for more than 20 schools in the model with five inputs and # three outputs inp <- np[which(np[,2]>20),1] dea.plot(x[,1],y[,1]) points(x[inp,1], y[inp,1], pch=16, col="red")
data(charnes1981) x <- with(charnes1981, cbind(x1,x2,x3,x4,x5)) y <- with(charnes1981, cbind(y1,y2,y3)) # Farrell inpout efficiency; vrs technology e <- dea(x,y) # The number of times each peer is a peer np <- get.number.peers(e) # Peers that are peers for more than 20 schools, and the number of # times they are peers np[which(np[,2]>20),] # Plot first input against first output and emphasize the peers that # are peers for more than 20 schools in the model with five inputs and # three outputs inp <- np[which(np[,2]>20),1] dea.plot(x[,1],y[,1]) points(x[inp,1], y[inp,1], pch=16, col="red")
Estimates the input and/or output vector(s) that minimize cost, maximize revenue or maximize profit in the context of a DEA technology
cost.opt(XREF, YREF, W, YOBS=NULL, RTS="vrs", param=NULL, TRANSPOSE=FALSE, LP=FALSE, CONTROL=NULL, LPK = NULL) revenue.opt(XREF, YREF, P, XOBS=NULL, RTS="vrs", param=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL=NULL, LPK = NULL) profit.opt(XREF, YREF, W, P, RTS = "vrs", param=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL=NULL, LPK = NULL)
cost.opt(XREF, YREF, W, YOBS=NULL, RTS="vrs", param=NULL, TRANSPOSE=FALSE, LP=FALSE, CONTROL=NULL, LPK = NULL) revenue.opt(XREF, YREF, P, XOBS=NULL, RTS="vrs", param=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL=NULL, LPK = NULL) profit.opt(XREF, YREF, W, P, RTS = "vrs", param=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL=NULL, LPK = NULL)
XREF |
Input of the firms defining the technology, a K x m matrix
of observations of K firms with m inputs (firm x input). In case
|
|||||||||||||||||||||
YREF |
output of the firms defining the technology, a K x n
matrix of observations of K firms with n outputs (firm x input). In
case |
|||||||||||||||||||||
W |
Input prices as a matrix. Either same prices for all firms or individual prices for all firms, i.e. either a 1 x m or a K x m matrix for K firms and m inputs |
|||||||||||||||||||||
P |
Output prices as a matrix. Either same prices for all firms or individual prices for all firms, i.e. either a 1 x n or K x n matrix for K firms and n outputs |
|||||||||||||||||||||
XOBS |
The input for which an optimal, revenue maximizing, output
vector is to be calculated. Defaults is |
|||||||||||||||||||||
YOBS |
The output for which an optimal, cost minimizing input
vector is to be calculated. Defaults is |
|||||||||||||||||||||
RTS |
A text string or a number defining the underlying DEA technology / returns to scale assumption.
|
|||||||||||||||||||||
param |
Possible parameters. Now only used for
RTS="fdh+" to set low and high values for restrictions on lambda;
see the section details and examples in |
|||||||||||||||||||||
TRANSPOSE |
Input and output matrices are treated as firms times
goods for the default value |
|||||||||||||||||||||
LP |
Only for debugging. If |
|||||||||||||||||||||
CONTROL |
Possible controls to lpSolveAPI, see the
documentation for that package. For examples of use see the
function |
|||||||||||||||||||||
LPK |
When |
Input and output matrices are in the same form as for the
method dea
.
The LP optimization problem is formulated in Bogetoft and Otto (2011, pp 35 and 102) and is solved by the LP method in the package lpSolveAPI.
The methods print
and summary
are working for
cost.opt
, revenue.opt
, and profit.opt
The values returned are the optimal input, and/or optimal output. When saved in an object the following components are available:
xopt |
The optimal input, returned as a matrix by
|
yopt |
The optimal output, returned as a matrix by
|
cost |
The optimal/minimal cost. |
revenue |
The optimal/maximal revenue |
profit |
The optimal/maximal profit |
lambda |
The peer weights that determines the technology, a matrix. Each row is the lambdas for the firm corresponding to that row; for the vrs technology the rows sum to 1. A column shows for a given firm how other firms are compared to this firm, i.e. peers are firms with a positive element in their columns. |
The index for peer units can be returned by the method peers
and the weights are returned in lambda
. Note that the peers
now are the firms for the optimal input and/or output allocation, not
just the technical efficient firms.
If a numerical problem occurs, status=5, or if no solution can be found,
the best solution is often to scale the input X
and output
Y
yourself or use the option CONTROL
to change scaling in
the program itself, as described in the notes for dea
.
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R; Springer 2011
Paul W. Wilson (2008), “FEAR 1.0: A Software Package for Frontier Efficiency Analysis with R,” Socio-Economic Planning Sciences 42, 247–254
x <- matrix(c(2,12, 2,8, 5,5, 10,4, 10,6, 3,13), ncol=2, byrow=TRUE) y <- matrix(1,nrow=dim(x)[1],ncol=1) w <- matrix(c(1.5, 1),ncol=2) txt <- LETTERS[1:dim(x)[1]] dea.plot(x[,1],x[,2], ORIENTATION="in", cex=1.25) text(x[,1],x[,2],txt,adj=c(-.7,-.2),cex=1.25) # technical efficiency te <- dea(x,y,RTS="vrs") xopt <- cost.opt(x,y,w,RTS=1) cobs <- x %*% t(w) copt <- xopt$x %*% t(w) # cost efficiency ce <- copt/cobs # allocaltive efficiency ae <- ce/te$eff data.frame("ce"=ce,"te"=te$eff,"ae"=ae) print(cbind("ce"=c(ce),"te"=te$eff,"ae"=c(ae)),digits=2) # isocost line in the technology plot abline(a=copt[1]/w[2], b=-w[1]/w[2], lty="dashed")
x <- matrix(c(2,12, 2,8, 5,5, 10,4, 10,6, 3,13), ncol=2, byrow=TRUE) y <- matrix(1,nrow=dim(x)[1],ncol=1) w <- matrix(c(1.5, 1),ncol=2) txt <- LETTERS[1:dim(x)[1]] dea.plot(x[,1],x[,2], ORIENTATION="in", cex=1.25) text(x[,1],x[,2],txt,adj=c(-.7,-.2),cex=1.25) # technical efficiency te <- dea(x,y,RTS="vrs") xopt <- cost.opt(x,y,w,RTS=1) cobs <- x %*% t(w) copt <- xopt$x %*% t(w) # cost efficiency ce <- copt/cobs # allocaltive efficiency ae <- ce/te$eff data.frame("ce"=ce,"te"=te$eff,"ae"=ae) print(cbind("ce"=c(ce),"te"=te$eff,"ae"=c(ae)),digits=2) # isocost line in the technology plot abline(a=copt[1]/w[2], b=-w[1]/w[2], lty="dashed")
Calculates critical value for test using bootstrap output in DEA models
critValue(s, alpha=0.05)
critValue(s, alpha=0.05)
s |
Vector with calculated values of the statistic for each of
the |
alpha |
The size of the test |
Needs bootstrapped values of the test statistic
Returns the critical value
Peter Bogetoft and Lars Otto [email protected]
boot.sw98
in FEAR, Paul W. Wilson (2008),
“FEAR 1.0: A Software Package for Frontier Efficiency Analysis
with R,” Socio-Economic Planning Sciences 42, 247–254
# The critical value for two-sided test in normal distribution found # by simulation. x <- rnorm(1000000) critValue(x,.975)
# The critical value for two-sided test in normal distribution found # by simulation. x <- rnorm(1000000) critValue(x,.975)
Estimates a DEA frontier and calculates efficiency measures a la Farrell.
dea(X, Y, RTS="vrs", ORIENTATION="in", XREF=NULL, YREF=NULL, FRONT.IDX=NULL, SLACK=FALSE, DUAL=FALSE, DIRECT=NULL, param=NULL, TRANSPOSE=FALSE, FAST=FALSE, LP=FALSE, CONTROL=NULL, LPK=NULL) ## S3 method for class 'Farrell' print(x, digits=4, ...) ## S3 method for class 'Farrell' summary(object, digits=4, ...)
dea(X, Y, RTS="vrs", ORIENTATION="in", XREF=NULL, YREF=NULL, FRONT.IDX=NULL, SLACK=FALSE, DUAL=FALSE, DIRECT=NULL, param=NULL, TRANSPOSE=FALSE, FAST=FALSE, LP=FALSE, CONTROL=NULL, LPK=NULL) ## S3 method for class 'Farrell' print(x, digits=4, ...) ## S3 method for class 'Farrell' summary(object, digits=4, ...)
X |
Inputs of firms to be evaluated, a K x m matrix
of observations of K firms with m inputs (firm x input). In case
|
|||||||||||||||||||||||||||
Y |
Outputs of firms to be evaluated, a K x n matrix
of observations of K firms with n outputs (firm x input). In case
|
|||||||||||||||||||||||||||
RTS |
Text string or a number defining the underlying DEA technology / returns to scale assumption.
|
|||||||||||||||||||||||||||
ORIENTATION |
Input efficiency "in" (1), output efficiency "out"
(2), and graph efficiency "graph" (3). For use with |
|||||||||||||||||||||||||||
XREF |
Inputs of the firms determining the technology, defaults
to |
|||||||||||||||||||||||||||
YREF |
Outputs of the firms determining the technology, defaults
to |
|||||||||||||||||||||||||||
FRONT.IDX |
Index for firms determining the technology |
|||||||||||||||||||||||||||
SLACK |
Calculate slack in a phase II calculation by an intern
call of the function |
|||||||||||||||||||||||||||
DUAL |
Calculate dual variables, i.e. shadow prices; not calculated for orientation graph as that is not an LP problem. |
|||||||||||||||||||||||||||
DIRECT |
Directional efficiency, If the argument is a scalar, the direction is (1,1,...,1) times the scalar; the value of the efficiency depends on the scalar as well as on the unit of measurements. If the argument is an array, this is used for the direction for every
firm; the length of the array must correspond to the number of
inputs and/or outputs depending on the If the argument is a matrix then different directions are used for
each firm. The dimensions depends on the
|
|||||||||||||||||||||||||||
param |
Possible parameters. At the moment only used for
RTS="fdh+" to set low and high values for restrictions on lambda;
see the section details and examples for its use. Future versions
might also use |
|||||||||||||||||||||||||||
TRANSPOSE |
Input and output matrices are treated as firms
times goods matrices for the default value |
|||||||||||||||||||||||||||
LP |
Only for debugging. If |
|||||||||||||||||||||||||||
FAST |
Only calculate efficiencies and just return them as a
vector, i.e. no lambda or other output. The return when using
FAST cannot be used as input for |
|||||||||||||||||||||||||||
CONTROL |
Possible controls to lpSolveAPI, see the
documentation for that package; use |
|||||||||||||||||||||||||||
... |
Optional parameters for the print and summary methods. |
|||||||||||||||||||||||||||
object , x
|
An object of class Farrell (returned by the
function |
|||||||||||||||||||||||||||
digits |
digits in printed output, handled by format in print. |
|||||||||||||||||||||||||||
LPK |
when |
The return from dea
and sdea
is an object of class
Farrell. The efficiency in dea
is calculated by the LP method
in the package lpSolveAPI. Slacks can be calculated either in
the call of dea
using the option SLACK=TRUE
or in a
following call to the function slack
.
The directional efficiency when the argument DIRECT
is used,
depends on the unit of measurement and is not restricted to be less
than 1 (or greater than 1 for output efficiency) and is therefore
completely different from the Farrell efficiency.
The crs factor in RTS="fdh+" that sets the lower and upper bound can
be changed by the argument param
that will set the lower and
upper bound to 1-param and 1+param; the default value is
param=.15. The value must be greater than or equal to 0 and strictly
less than 1. A value of 0 corresponds to RTS="fdh". To get an
asymmetric interval set param to a 2 dimensional array with values for
the low and high end for interval, for instance
param=c(.8,1.15)
. The FDH+ technology set is described in
Bogetoft and Otto (2011) pages 73–74.
The technology RTS="vrs+" uses the parameter param
to set
restrictions on lambda, the convexity parameters. The elements of param
are param=(low, high, sum_low, sum_high)
where "low" and "high"
are restrictions on the individual lambda and "sum_low" and "sum_high" are
restrictions on the sum of lambdas. The individual lambda must be in the
interval from low to high or be zero. With one parameter the restrictions
set are (param, 1+1-(param),1,1)
, with two parameters
(param[1], param[2],1,1)
, and with four parameters
(param[1], param[2],param[3], param[4])
. The resulting technology set is
not necessarily convex.
The graph orientated efficiency is calculated by bisection between feasible and infeasible values of G. The precision in the result is less than for the other orientations.
When the argument DIRECT=d
is used then the returned value
e
for input orientation is the exces input measured in d
units of measurements, i.e. , and for output orientation
. The directional efficency can be restricted to inputs
(
ORIENTAION="in"
), restricted to outputs
(ORIENTAION="out"
), or both include inputs and output
directions (ORIENTAION="in-out"
). Directional efficiency is
discussed on pages 31–35 and 121–127 in Bogetoft and Otto (2011).
The results are returned in a Farrell object with the
following components. The last three components in the list are
only part of the object when SLACK=TRUE
.
eff |
The efficiencies. Note when DIRECT is used then the efficencies are not Farrell efficiencies but rather exces values in DIRECT units of measurement |
lambda |
The lambdas, i.e. the weight of the peers, for each firm |
objval |
The objective value as returned from the LP program;
normally the same as eff, but for |
RTS |
The return to scale assumption as in the option |
ORIENTATION |
The efficiency orientation as in the call |
TRANSPOSE |
As in the call |
slack |
A logical vector where the component for a firm is
|
sum |
A vector with sums of the slacks for each firm. Only
calculated in dea when option |
sx |
A matrix for input slacks for each firm, only calculated if
the option |
sy |
A matrix for output slack, see |
ux |
Dual variable for input, only calculated if |
vy |
Dual variable for output, only calculated if |
The arguments X
, Y
, XREF
, and YREF
are
supposed to be matrices or numerical data frames that in the function
will be converted to matrices. When subsetting a matrix or data frame
to just one column then the class of the resulting object/variable is
no longer a matrix or a data frame, but just a numeric (array,
vector). Therefore, in this case a numeric input that is not a matrix
nor a data frame is transformed to a 1 column matrix, and here the use
of the argument TRANSPOSE=TRUE
gives an error.
The dual values are not unique for extreme points (firms on the boundary with an efficiency of 1) and therefore the calculated dual values for these firms can depend on the order of firms in the reference technology. The same lack of uniqueness also makes the peers for some firms depend on the order of firms in the reference technology.
To calucalte slack use the argument SLACK=TRUE
or use the
function slack
directly.
When there is slack, and slack is not taken into consideration, then the peers for a firm with slack might depend on the order of firms in the data set; this is a property of the LP algorithm used to solve the problem.
To handle fixed, non-discretionary inputs, one can let it appear as
negative output in an input-based mode, and reversely for fixed,
non-discretionary outputs. Fixed inputs (outputs) can also be handled
by directional efficiency; set the direction, the argument
DIRECT
, equal to the variable, discretionary inputs (outputs)
and 0 for the fixed inputs (outputs).
When the the argument DIRECT=X
is used the then the returned
effiency is equal to 1 minus the Farrell efficiency for input
orientation and to the Farrell effiency minus 1 for output
orientation.
To use matrices X
and Y
prepared for the methods in the
package FEAR (Wilson 2008) set the options
TRANSPOSE=TRUE
; for consistency with FEAR the options
RTS
and ORIENTATION
also accepts numbers as in
FEAR.
The tolerance that lambda is zero or one is 1e-7, the default value of
'epsint' in the package lpSolveAPI
, i.e. values closer than
1e-7 from zero or one are set to respective integer value. The 'epsint'
is the tolerance that is used to determine whether a floating-point
number is in fact an in teger. The same tolerance is used for
efficiency value near one.
Some scaling is done in the function, but this does not always work
satisfactory, i.e. sometime, a solution cannot always be found – the
program prints a warning and the efficiency for the firm is set to NA.
Often this is due to a bad scaling of the data. Either the user can try
a different scaling of data when calling the function or one can use the
option CONTROL
to try a different scaling by the program. For
instance one can insert CONTROL=list(scaling=c("geometric", "equilibrate")
or CONTROL=list(scaling=c("curtisreid", "equilibrate", "dynupdate")
in the option list for the function call. The full list of possible
scaling options can be found found from ?lp.control.options
under
"scaling".
If a numerical problem occurs, status=5, the best solution is probably to
scale the input X
and output Y
yourself or use a different
scaling option as desribed above. The best results are obtained when the
variables are close to 1. If some variable are in the millions, then let
the unit of measure be a million.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
Paul W. Wilson (2008), “FEAR 1.0: A Software Package for Frontier Efficiency Analysis with R,” Socio-Economic Planning Sciences 42, 247–254
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) dea.plot.frontier(x,y,txt=TRUE) e <- dea(x,y) eff(e) print(e) summary(e) lambda(e) # Input savings potential for each firm (1-eff(e)) * x (1-e$eff) * x # calculate slacks el <- dea(x,y,SLACK=TRUE) data.frame(e$eff,el$eff,el$slack,el$sx,el$sy) # Fully efficient units, eff==1 and no slack which(eff(e) == 1 & !el$slack) # fdh+ with limits in the interval [.7, 1.2] dea(x,y,RTS="fdh+", param=c(.7,1.2))
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) dea.plot.frontier(x,y,txt=TRUE) e <- dea(x,y) eff(e) print(e) summary(e) lambda(e) # Input savings potential for each firm (1-eff(e)) * x (1-e$eff) * x # calculate slacks el <- dea(x,y,SLACK=TRUE) data.frame(e$eff,el$eff,el$slack,el$sx,el$sy) # Fully efficient units, eff==1 and no slack which(eff(e) == 1 & !el$slack) # fdh+ with limits in the interval [.7, 1.2] dea(x,y,RTS="fdh+", param=c(.7,1.2))
Calculates additive efficiency as sum of input and output slacks within different DEA models
dea.add(X, Y, RTS="vrs", XREF=NULL, YREF=NULL, FRONT.IDX=NULL, param=NULL, TRANSPOSE=FALSE, LP=FALSE)
dea.add(X, Y, RTS="vrs", XREF=NULL, YREF=NULL, FRONT.IDX=NULL, param=NULL, TRANSPOSE=FALSE, LP=FALSE)
X |
Inputs of firms to be evaluated, a K x m matrix
of observations of K firms with m inputs (firm x input). In case
|
||||||||||||||||||
Y |
Outputs of firms to be evaluated, a K x n matrix
of observations of K firms with n outputs (firm x input). In case
|
||||||||||||||||||
RTS |
Text string or a number defining the underlying DEA technology / returns to scale assumption.
|
||||||||||||||||||
XREF |
Inputs of the firms determining the technology, defaults
to |
||||||||||||||||||
YREF |
Outputs of the firms determining the technology, defaults
to |
||||||||||||||||||
FRONT.IDX |
Index for firms determining the technology |
||||||||||||||||||
param |
Possible parameters. At the moment only used for
RTS="fdh+" to set low and high values for restrictions on lambda;
see the section details and examples for its use. Future versions
might also use |
||||||||||||||||||
TRANSPOSE |
Input and output matrices are treated as firms
times goods matrices for the default value |
||||||||||||||||||
LP |
Only for debugging. If |
The sum of the slacks is maximized in a LP formulation of the DEA technology. The sum of the slacks can be seen as distance to the frontier when you only move parallel to the axes of inputs and outputs, i.e. not a usual Euclidean distance, but what is also known as an L1 norm.
Since it is the sum of slacks that is calculated, there is no exogenous ORIENTATION in the problem. Rather, there is generally both an input and an output direction in the slacks. The model considers the input excess and output shortfall simultaneously and finds a point on the frontier that is most distant to the point being evaluated.
sum |
Sum of all slacks for each firm,
|
slack |
A non-NULL vector of logical variables, |
sx |
A matrix of input slacks for each firm |
sy |
A matrix of output slack for each firm |
lambda |
The lambdas, i.e. the weights of the peers for each firm |
This is neither a Farrell nor a Shephard like efficiency.
The value of the slacks depends on the scaling of the different inputs and outputs. Therefore the values are not independent of how the input and output are measured.
Peter Bogetoft and Lars Otto [email protected]
Corresponds to Eqs. 4.34-4.38 in Cooper et al. (2007)
Bogetoft and Otto; Benchmarking with DEA, SFA, and R; Springer 2011
Cooper, Seiford, and Tone; Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software; Second edition, Springer 2007
x <- matrix(c(2,3,2,4,6,5,6,8),ncol=1) y <- matrix(c(1,3,2,3,5,2,3,5),ncol=1) dea.plot.frontier(x,y,txt=1:dim(x)[1]) sb <- dea.add(x,y,RTS="vrs") data.frame("sx"=sb$sx,"sy"=sb$sy,"sum"=sb$sum,"slack"=sb$slack)
x <- matrix(c(2,3,2,4,6,5,6,8),ncol=1) y <- matrix(c(1,3,2,3,5,2,3,5),ncol=1) dea.plot.frontier(x,y,txt=1:dim(x)[1]) sb <- dea.add(x,y,RTS="vrs") data.frame("sx"=sb$sx,"sy"=sb$sy,"sum"=sb$sum,"slack"=sb$slack)
The function dea.boot
bootstrap DEA models and
returns bootstrap of Farrell efficiencies. This function is slower than the
boot.sw89
from the package FEAR.
The faster function boot.fear
is a wrapper for boot.sw89
from
the package FEAR returning results directly as Farrell measures.
dea.boot(X, Y, NREP = 200, EFF = NULL, RTS = "vrs", ORIENTATION="in", alpha = 0.05, XREF = NULL, YREF = NULL, FRONT.IDX=NULL, EREF = NULL, DIRECT = NULL, TRANSPOSE = FALSE, SHEPHARD.INPUT = TRUE, LP, CONTROL=NULL) boot.fear(X, Y, NREP = 200, EFF = NULL, RTS = "vrs", ORIENTATION = "in", alpha = 0.05, XREF = NULL, YREF = NULL, EREF = NULL)
dea.boot(X, Y, NREP = 200, EFF = NULL, RTS = "vrs", ORIENTATION="in", alpha = 0.05, XREF = NULL, YREF = NULL, FRONT.IDX=NULL, EREF = NULL, DIRECT = NULL, TRANSPOSE = FALSE, SHEPHARD.INPUT = TRUE, LP, CONTROL=NULL) boot.fear(X, Y, NREP = 200, EFF = NULL, RTS = "vrs", ORIENTATION = "in", alpha = 0.05, XREF = NULL, YREF = NULL, EREF = NULL)
X |
Inputs of firms to be evaluated, a K x m matrix of observations of K firms with m inputs (firm x input) |
Y |
Outputs of firms to be evaluated, a K x n matrix of observations of K firms with n outputs (firm x input). |
NREP |
Number of bootstrap replications |
EFF |
Efficiencies for (X,Y) relative to the technology generated from (XREF,YREF). |
RTS |
The returns to scale assumptions as in |
ORIENTATION |
Input efficiency "in" (1), output efficiency "out" (2), and graph efficiency "graph" (3). |
alpha |
One minus the size of the confidence interval for the bias corrected efficiencies |
XREF |
Inputs of the firms determining the technology, defaults
to |
YREF |
Outputs of the firms determining the technology,
defaults to |
FRONT.IDX |
Index for firms determining the technology. |
EREF |
Efficiencies for the firms in XREF, YREF. |
DIRECT |
Does not yet work and is therefore not used. |
TRANSPOSE |
Input and output matrices are K x m and K x n for
the default value |
SHEPHARD.INPUT |
The bootstrap of the Farrell input efficiencies is done as a Shephard input distance function, the inverse Farrell input efficiency. The option is only relevant for input and graph directions. |
LP |
Only for debugging purposes. |
CONTROL |
Possible controls to lpSolveAPI, see the
documentation for that package. For examples of use see the
function |
The details are lightly explained in Bogetoft and Otto (2011) Chap. 6, and with more mathematical details in Dario and Simar (2007) Sect. 3.4 and in Simar and Wilson (1998).
The bootstrap at the moment does not work for any kind of directional efficiency.
The returned confidence intervals are for the bias corrected efficiencies; to get confidence intervals for the uncorrected efficiencies add the biases to both upper and lower values for the intervals.
Under the default option SHEPHARD.INPUT=TRUE
bias and bias
corrected efficiencies are calculated for Shephard input distance
function and then transformed to Farrell input efficiencies to
avoid possible negative biased corrected input efficiencies. If
this is not wanted use the option SHEPHARD.INPUT=FALSE
. This
option is only relevant for input and graph oriented directions.
The returned values from both functions are as follows:
eff |
Efficiencies |
eff.bc |
Bias-corrected efficiencies |
bias |
An array of bootstrap bias estimates for the K firms |
conf.int |
|
var |
An array of bootstrap variance estimates for the K firms |
boot |
The replica bootstrap estimates of the Farrell
efficiencies, a |
The function dea.boot
does not depend on the FEAR package
and can therefore be used on computers where the package FEAR is
not available. This, however, comes with a time penalty as it takes
around 4 times longer to run compared to using FEAR directly
The returned bootstrap estimates from FEAR::boot.sw98
of efficiencies are sorted for each firm individually.
Unfortunately, this means that the component of replicas is not the efficiencies for
the same bootstrap replica, but could easily be from different
bootstrap replicas. This also means that this function can
not be used to bootstrap tests for statistical hypotheses
where the statistics involves summing of firm's efficiencies.
If a numerical problem occurs, status=5, or if no solution can be found,
the best solution is often to scale the input X
and output
Y
yourself or use the option CONTROL
to change scaling in
the program itself, as described in the notes for dea
.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011.
Cinzia Dario and L. Simar; Advanced Robust and Nonparametric Methods in Efficiency Analysis. Methodology and Applications; Springer 2007.
Leopold Simar and Paul .W. Wilson (1998), “Sensitivity analysis of efficiency scores: How to bootstrap in nonparametric frontier models”, Management Science 44, 49–61.
Paul W. Wilson (2008), “FEAR 1.0: A Software Package for Frontier Efficiency Analysis with R,” Socio-Economic Planning Sciences 42, 247–254
The documentation for boot.sw98
in the package
FEAR.
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c( 75,100,300,400, 25, 50,400),ncol=1) e <- dea(x,y) eff(e) dea.plot.frontier(x,y,txt=TRUE) # To bootstrap for real, NREP should be at least 2000. Run the # following lines a couple of times with nrep=100 and see how the # bootstrap frontier changes from one run to the next. Try the same # with NREP=2000 even though is does take a longer time to run, # especially for dea.boot. nrep <- 5 # nrep <- 2000 # if ( "FEAR" %in% .packages(TRUE) ) { ## The following only works if the package FEAR is installed; it does ## not have to be loaded. # b <- boot.fear(x,y, NREP=nrep) # } else { b <- dea.boot(x,y, NREP=nrep) # } # bias corrected frontier dea.plot.frontier(b$eff.bc*x, y, add=TRUE, lty="dashed") # outer 95% confidence interval frontier for uncorrected frontier dea.plot.frontier((b$conf.int[,1]+b$bias)*x, y, add=TRUE, lty="dotted") ## Test of hypothesis in DEA model # Null hypothesis is that technology is CRS and the alternative is VRS # Bogetoft and Otto (2011) pages 183--185. ec <- dea(x,y, RTS="crs") Ec <- eff(ec) ev <- dea(x,y, RTS="vrs") Ev <- eff(ev) # The test statistic; equation (6.1) S <- sum(Ec)/sum(Ev) # To calculate CRS and VRS efficiencies in the same bootstrap replicas # we reset the random number generator before each call of the # function dea.boot. # To get the an initial value for the random number generating process # we save its state (seed) save.seed <- sample.int(1e9,1) # The bootstrap and calculate CRS and VRS under the assumption that # the true technology is CRS (the null hypothesis) and such that the # results corresponds to the case where CRS and VRS are calculated for # the same reference set of firms; to make this happen we set the # random number generator to the same state before the calls. set.seed(save.seed) bc <- dea.boot(x,y, nrep,, RTS="crs") set.seed(save.seed) bv <- dea.boot(x,y, nrep,, RTS="vrs", XREF=x,YREF=y, EREF=ec$eff) # Calculate the statistic for each bootstrap replica bs <- colSums(bc$boot)/colSums(bv$boot) # The critical value for the test (default size \code{alpha} of test is 5%) critValue(bs, alpha=.1) S # Accept the hypothesis at 10% level? critValue(bs, alpha=.1) <= S # The probability of observing a smaller value of S when the # hypothesis is true; the p--value. typeIerror(S, bs) # Accept the hypothesis at size level 10%? typeIerror(S, bs) >= .10
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c( 75,100,300,400, 25, 50,400),ncol=1) e <- dea(x,y) eff(e) dea.plot.frontier(x,y,txt=TRUE) # To bootstrap for real, NREP should be at least 2000. Run the # following lines a couple of times with nrep=100 and see how the # bootstrap frontier changes from one run to the next. Try the same # with NREP=2000 even though is does take a longer time to run, # especially for dea.boot. nrep <- 5 # nrep <- 2000 # if ( "FEAR" %in% .packages(TRUE) ) { ## The following only works if the package FEAR is installed; it does ## not have to be loaded. # b <- boot.fear(x,y, NREP=nrep) # } else { b <- dea.boot(x,y, NREP=nrep) # } # bias corrected frontier dea.plot.frontier(b$eff.bc*x, y, add=TRUE, lty="dashed") # outer 95% confidence interval frontier for uncorrected frontier dea.plot.frontier((b$conf.int[,1]+b$bias)*x, y, add=TRUE, lty="dotted") ## Test of hypothesis in DEA model # Null hypothesis is that technology is CRS and the alternative is VRS # Bogetoft and Otto (2011) pages 183--185. ec <- dea(x,y, RTS="crs") Ec <- eff(ec) ev <- dea(x,y, RTS="vrs") Ev <- eff(ev) # The test statistic; equation (6.1) S <- sum(Ec)/sum(Ev) # To calculate CRS and VRS efficiencies in the same bootstrap replicas # we reset the random number generator before each call of the # function dea.boot. # To get the an initial value for the random number generating process # we save its state (seed) save.seed <- sample.int(1e9,1) # The bootstrap and calculate CRS and VRS under the assumption that # the true technology is CRS (the null hypothesis) and such that the # results corresponds to the case where CRS and VRS are calculated for # the same reference set of firms; to make this happen we set the # random number generator to the same state before the calls. set.seed(save.seed) bc <- dea.boot(x,y, nrep,, RTS="crs") set.seed(save.seed) bv <- dea.boot(x,y, nrep,, RTS="vrs", XREF=x,YREF=y, EREF=ec$eff) # Calculate the statistic for each bootstrap replica bs <- colSums(bc$boot)/colSums(bv$boot) # The critical value for the test (default size \code{alpha} of test is 5%) critValue(bs, alpha=.1) S # Accept the hypothesis at 10% level? critValue(bs, alpha=.1) <= S # The probability of observing a smaller value of S when the # hypothesis is true; the p--value. typeIerror(S, bs) # Accept the hypothesis at size level 10%? typeIerror(S, bs) >= .10
Directional efficiency rescaled to an interpretation a la Farrell efficiency and the corresponding peer importance (lambda).
dea.direct(X, Y, DIRECT, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, SLACK = FALSE, param=NULL, TRANSPOSE = FALSE)
dea.direct(X, Y, DIRECT, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, SLACK = FALSE, param=NULL, TRANSPOSE = FALSE)
X |
Inputs of firms to be evaluated, a K x m matrix of observations of K firms with m inputs (firm x input) |
|||||||||||||||||||||
Y |
Outputs of firms to be evaluated, a K x n matrix of observations of K firms with n outputs (firm x input). |
|||||||||||||||||||||
DIRECT |
Directional efficiency, If the argument is a scalar, the direction is (1,1,...,1) times the scalar; the value of the efficiency depends on the scalar as well as on the unit of measurements. If the argument an array, this is used for the direction for
every firm; the length of the array must correspond to the
number of inputs and/or outputs depending on the
If the argument is a matrix then different directions are used
for each firm. The dimensions depends on the
|
|||||||||||||||||||||
RTS |
Text string or a number defining the underlying DEA technology / returns to scale assumption.
|
|||||||||||||||||||||
ORIENTATION |
Input efficiency "in" (1), output
efficiency "out" (2), and graph efficiency "graph" (3). For use
with |
|||||||||||||||||||||
XREF |
Inputs of the firms determining the technology, defaults
to |
|||||||||||||||||||||
YREF |
Outputs of the firms determining the technology,
defaults to |
|||||||||||||||||||||
FRONT.IDX |
Index for firms determining the technology. |
|||||||||||||||||||||
SLACK |
||||||||||||||||||||||
param |
Possible parameters. At the moment only used for
RTS="fdh+" to set low and high values for restrictions on lambda;
see the section details and examples in |
|||||||||||||||||||||
TRANSPOSE |
see |
When the argument DIRECT=d
is used then component
objval
of the returned object for input orientation is the
maximum value of e where for input orientation , and for
output orientation
are in the generated technology
set. The returned component
eff
is for input
and for output
to make the interpretation as for a
Farrell efficiency. Note that when the direction is not
proportional to
X
or Y
the returned eff
are
different for different inputs or outputs and eff
is a matrix
and not just an array. The directional efficiency can be restricted
to inputs (ORIENTATION="in"
), restricted to outputs
(ORIENTATION="out"
), or both include inputs and output
directions (ORIENTATION="in-out"
). Directional efficiency is
discussed on pages 31–35 and 121–127 in Bogetoft and Otto (2011).
The Farrell efficiency interpretation is the ratio by which a firm can proportionally reduce all inputs (or expand all outputs) without producing less outputs (using more inputs). The directional efficiencies have the same interpretation expect that the direction is not proportional to the inputs (or outputs) and therefore the different inputs may have different reduction ratios, the efficiency is an array and not just a number.
The results are returned in a Farrell object with the following
components. The method slack
only returns the three
components in the list relevant for slacks.
eff |
The Farrell efficiencies. Note that the efficiencies
are calculated to have the same interpretations as Farrell
efficiencies. |
lambda |
The lambdas, i.e. the weight of the peers, for each firm |
objval |
The objective value as returned from the LP program; the
|
RTS |
The return to scale assumption as in the option |
ORIENTATION |
The efficiency orientation as in the call |
TRANSPOSE |
As in the call |
slack |
A vector with sums of the slacks for each firm. Only
calculated in dea when option |
sx |
A matrix for input slacks for each firm, only calculated if
the option |
sy |
A matrix for output slack, see |
To handle fixed, non-discretionary inputs, one can let it
appear as negative output in an input-based mode, and reversely for
fixed, non-discretionary outputs. Fixed inputs (outputs) can also be
handled by directional efficiency; set the direction, the argument
DIRECT
, equal to the variable, discretionary inputs (outputs)
and 0 for the fixed inputs (outputs).
When the argument DIRECT=X
is used the then the returned
efficiency is equal to 1 minus the Farrell efficiency for input
orientation and equal to the Farrell efficiency minus 1 for output
orientation.
Peter Bogetoft and Lars Otto [email protected]
Directional efficiency is discussed on pages 31–35 and 121–127 in Bogetoft and Otto (2011).
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
# Directional efficiency x <- matrix(c(2,5 , 1,2 , 2,2 , 3,2 , 3,1 , 4,1), ncol=2,byrow=TRUE) y <- matrix(1,nrow=dim(x)[1]) dea.plot.isoquant(x[,1], x[,2],txt=1:dim(x)[1]) E <- dea(x,y) z <- c(1,1) e <- dea.direct(x,y,DIRECT=z) data.frame(Farrell=E$eff, Perform=e$eff, objval=e$objval) # The direction arrows(x[,1], x[,2], (x-z)[,1], (x-z)[,2], lty="dashed") # The efficiency (e$objval) along the direction segments(x[,1], x[,2], (x-e$objval*z)[,1], (x-e$objval*z)[,2], lwd=2) # Different directions x1 <- c(.5, 1, 2, 4, 3, 1) x2 <- c(4, 2, 1,.5, 2, 4) x <- cbind(x1,x2) y <- matrix(1,nrow=dim(x)[1]) dir1 <- c(1,.25) dir2 <- c(.25, 4) dir3 <- c(1,4) e <- dea(x,y) e1 <- dea.direct(x,y,DIRECT=dir1) e2 <- dea.direct(x,y,DIRECT=dir2) e3 <- dea.direct(x,y,DIRECT=dir3) data.frame(e=eff(e),e1=e1$eff,e2=e2$eff,e3=e3$eff)[6,] # Technology and directions for all firms dea.plot.isoquant(x[,1], x[,2],txt=1:dim(x)[1]) arrows(x[,1], x[,2], x[,1]-dir1[1], x[,2]-dir1[2],lty="dashed") segments(x[,1], x[,2], x[,1]-e1$objval*dir1[1], x[,2]-e1$objval*dir1[2],lwd=2) # slack for direction 1 dsl1 <- slack(x,y,e1) cbind(E=e$eff,e1$eff,dsl1$sx,dsl1$sy, sum=dsl1$sum) # Technology and directions for firm 6, # Figure 2.6 page 32 in Bogetoft & Otto (2011) dea.plot.isoquant(x1,x2,lwd=1.5, txt=TRUE) arrows(x[6,1], x[6,2], x[6,1]-dir1[1], x[6,2]-dir1[2],lty="dashed") arrows(x[6,1], x[6,2], x[6,1]-dir2[1], x[6,2]-dir2[2],lty="dashed") arrows(x[6,1], x[6,2], x[6,1]-dir3[1], x[6,2]-dir3[2],lty="dashed") segments(x[6,1], x[6,2], x[6,1]-e1$objval[6]*dir1[1], x[6,2]-e1$objval[6]*dir1[2],lwd=2) segments(x[6,1], x[6,2], x[6,1]-e2$objval[6]*dir2[1], x[6,2]-e2$objval[6]*dir2[2],lwd=2) segments(x[6,1], x[6,2], x[6,1]-e3$objval[6]*dir3[1], x[6,2]-e3$objval[6]*dir3[2],lwd=2)
# Directional efficiency x <- matrix(c(2,5 , 1,2 , 2,2 , 3,2 , 3,1 , 4,1), ncol=2,byrow=TRUE) y <- matrix(1,nrow=dim(x)[1]) dea.plot.isoquant(x[,1], x[,2],txt=1:dim(x)[1]) E <- dea(x,y) z <- c(1,1) e <- dea.direct(x,y,DIRECT=z) data.frame(Farrell=E$eff, Perform=e$eff, objval=e$objval) # The direction arrows(x[,1], x[,2], (x-z)[,1], (x-z)[,2], lty="dashed") # The efficiency (e$objval) along the direction segments(x[,1], x[,2], (x-e$objval*z)[,1], (x-e$objval*z)[,2], lwd=2) # Different directions x1 <- c(.5, 1, 2, 4, 3, 1) x2 <- c(4, 2, 1,.5, 2, 4) x <- cbind(x1,x2) y <- matrix(1,nrow=dim(x)[1]) dir1 <- c(1,.25) dir2 <- c(.25, 4) dir3 <- c(1,4) e <- dea(x,y) e1 <- dea.direct(x,y,DIRECT=dir1) e2 <- dea.direct(x,y,DIRECT=dir2) e3 <- dea.direct(x,y,DIRECT=dir3) data.frame(e=eff(e),e1=e1$eff,e2=e2$eff,e3=e3$eff)[6,] # Technology and directions for all firms dea.plot.isoquant(x[,1], x[,2],txt=1:dim(x)[1]) arrows(x[,1], x[,2], x[,1]-dir1[1], x[,2]-dir1[2],lty="dashed") segments(x[,1], x[,2], x[,1]-e1$objval*dir1[1], x[,2]-e1$objval*dir1[2],lwd=2) # slack for direction 1 dsl1 <- slack(x,y,e1) cbind(E=e$eff,e1$eff,dsl1$sx,dsl1$sy, sum=dsl1$sum) # Technology and directions for firm 6, # Figure 2.6 page 32 in Bogetoft & Otto (2011) dea.plot.isoquant(x1,x2,lwd=1.5, txt=TRUE) arrows(x[6,1], x[6,2], x[6,1]-dir1[1], x[6,2]-dir1[2],lty="dashed") arrows(x[6,1], x[6,2], x[6,1]-dir2[1], x[6,2]-dir2[2],lty="dashed") arrows(x[6,1], x[6,2], x[6,1]-dir3[1], x[6,2]-dir3[2],lty="dashed") segments(x[6,1], x[6,2], x[6,1]-e1$objval[6]*dir1[1], x[6,2]-e1$objval[6]*dir1[2],lwd=2) segments(x[6,1], x[6,2], x[6,1]-e2$objval[6]*dir2[1], x[6,2]-e2$objval[6]*dir2[2],lwd=2) segments(x[6,1], x[6,2], x[6,1]-e3$objval[6]*dir3[1], x[6,2]-e3$objval[6]*dir3[2],lwd=2)
Solution of dual DEA models, possibly with partial value information given as restrictions on the ratios (assurance regions)
dea.dual(X, Y, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, DUAL = NULL, DIRECT=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL=NULL, LPK=NULL)
dea.dual(X, Y, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, DUAL = NULL, DIRECT=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL=NULL, LPK=NULL)
X |
Inputs of firms to be evaluated, a K x m matrix of
observations of K firms with m inputs (firm x input). In case
|
||||||||||||
Y |
Outputs of firms to be evaluated, a K x n matrix of
observations of K firms with n outputs (firm x input). In case
|
||||||||||||
RTS |
A text string or a number defining the underlying DEA technology / returns to scale assumption.
|
||||||||||||
ORIENTATION |
Input efficiency "in" (1), output efficiency "out"
(2), and graph efficiency "graph" (3) (not yet implemented). For
use with |
||||||||||||
XREF |
Input of the firms determining the technology, defaults to
|
||||||||||||
YREF |
Output of the firms determining the technology, defaults
to |
||||||||||||
FRONT.IDX |
Index for firms determining the technology |
||||||||||||
DUAL |
Matrix of order “number of inputs plus number of outputs minus 2” times 2. The first column is the lower bound and the second column is the upper bound for the restrictions on the multiplier ratios. The ratios are relative to the first input and the first output, respectively. This implies that there is no restriction for neither the first input nor the first output so that the number of restrictions is two less than the total number of inputs and outputs. |
||||||||||||
DIRECT |
Directional efficiency, NB Not yet implemented |
||||||||||||
TRANSPOSE |
Input and output matrices are treated as firms
times goods for the default value |
||||||||||||
LP |
Only for debugging. If |
||||||||||||
CONTROL |
Possible controls to lpSolveAPI, see the documentation for that package. |
||||||||||||
LPK |
When |
Solved as an LP program using the package lpSolveAPI. The
method dea.dual.dea
calls the method dea
with the option
DUAL=TRUE
.
eff |
The efficiencies |
objval |
The objective value as returned from the LP problem, normally the same as eff |
RTS |
The return to scale assumption as in the option |
ORIENTATION |
The efficiency orientation as in the call |
TRANSPOSE |
As in the call |
u |
Dual values, prices, for inputs |
v |
Dual values, prices, for outputs |
gamma |
The values of gamma, the shadow price(s) for returns to scale restriction |
sol |
Solution of all variables as one component, sol=c(u,v,gamma). |
Note that the dual values are not unique for extreme points in the technology set. In this case the value of the calculated dual variable can depend on the order of the complete efficient firms.
If a numerical problem occurs, status=5, or if no solution can be found,
the best solution is often to scale the input X
and output
Y
yourself or use the option CONTROL
to change scaling in
the program itself, as described in the notes for dea
.
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R; Springer 2011. Sect. 5.10: Partial value information
x <- matrix(c(2,5 , 1,2 , 2,2 , 3,2 , 3,1 , 4,1), ncol=2,byrow=TRUE) y <- matrix(1,nrow=dim(x)[1]) dea.plot.isoquant(x[,1],x[,2],txt=1:dim(x)[1]) segments(0,0, x[,1], x[,2], lty="dotted") e <- dea(x,y,RTS="crs",SLACK=TRUE) ed <- dea.dual(x,y,RTS="crs") print(cbind("e"=e$eff,"ed"=ed$eff, peers(e), lambda(e), e$sx, e$sy, ed$u, ed$v), digits=3) dual <- matrix(c(.5, 2.5), nrow=dim(x)[2]+dim(y)[2]-2, ncol=2, byrow=TRUE) er <- dea.dual(x,y,RTS="crs", DUAL=dual) print(cbind("e"=e$eff,"ar"=er$eff, lambda(e), e$sx, e$sy, er$u, "ratio"=er$u[,2]/er$u[,1],er$v),digits=3)
x <- matrix(c(2,5 , 1,2 , 2,2 , 3,2 , 3,1 , 4,1), ncol=2,byrow=TRUE) y <- matrix(1,nrow=dim(x)[1]) dea.plot.isoquant(x[,1],x[,2],txt=1:dim(x)[1]) segments(0,0, x[,1], x[,2], lty="dotted") e <- dea(x,y,RTS="crs",SLACK=TRUE) ed <- dea.dual(x,y,RTS="crs") print(cbind("e"=e$eff,"ed"=ed$eff, peers(e), lambda(e), e$sx, e$sy, ed$u, ed$v), digits=3) dual <- matrix(c(.5, 2.5), nrow=dim(x)[2]+dim(y)[2]-2, ncol=2, byrow=TRUE) er <- dea.dual(x,y,RTS="crs", DUAL=dual) print(cbind("e"=e$eff,"ar"=er$eff, lambda(e), e$sx, e$sy, er$u, "ratio"=er$u[,2]/er$u[,1],er$v),digits=3)
Calculate and decompose potential gains from mergers of similar firms (horizontal integration).
dea.merge(X, Y, M, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, TRANSPOSE=FALSE, CONTROL=NULL)
dea.merge(X, Y, M, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, TRANSPOSE=FALSE, CONTROL=NULL)
X |
K times m matrix as in |
Y |
K times n matrix as in |
M |
Kg times K matrix where each row defines a merger by the
firms (columns) included in the matrix as returned from method
|
RTS |
as in |
ORIENTATION |
as in |
XREF |
as in |
YREF |
as in |
FRONT.IDX |
as in |
TRANSPOSE |
as in |
CONTROL |
Possible controls to lpSolveAPI, see the
documentation for that package. For examples of use see the
function |
The K firms are merged into Kg new, merged firms.
Most of the arguments correspond to the arguments in
dea
, with K firms, m inputs, and n outputs.
The decomposition is summarized on page 275 and in table 9.1 page 276 in Bogetoft and Otto (2011) and is based on Bogetoft and Wang (2005)
Eff |
Overall efficiencies of mergers, Kg vector |
Estar |
Adjusted overall efficiencies of mergers after the removal of individual learning, Kg vector |
learning |
Learning effects, Kg vector |
harmony |
Harmony (scope) effects, Kg vector |
size |
Size (scale) effects, Kg vector |
If a numerical problem occurs, status=5, or if no solution can be found,
the best solution is often to scale the input X
and output
Y
yourself or use the option CONTROL
to change scaling in
the program itself, as described in the notes for dea
.
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R; chapter 9; Springer 2011.
Bogetoft and Wang; “Estimating the Potential Gains from Mergers”; Journal of Productivity Ana-lysis, 23, pp. 145-171, 2005.
dea
and make.merge
x <- matrix(c(100,200,300,500),ncol=1,dimnames=list(LETTERS[1:4],"x")) y <- matrix(c(75,100,300,400),ncol=1,dimnames=list(LETTERS[1:4], "y")) dea.plot.frontier(x,y,RTS="vrs",txt=LETTERS[1:length(x)], xlim=c(0,1000),ylim=c(0,1000) ) dea.plot.frontier(x,y,RTS="drs", add=TRUE, lty="dashed", lwd=2) dea.plot.frontier(x,y,RTS="crs", add=TRUE, lty="dotted") dea(x,y,RTS="crs") M <- make.merge(list(c(1,2), c(3,4)), X=x) xmer <- M %*% x ymer <- M %*% y points(xmer,ymer,pch=8) text(xmer,ymer,labels=c("A+B","C+D"),pos=4) dea.merge(x,y,M, RTS="vrs") dea.merge(x,y,M, RTS="crs")
x <- matrix(c(100,200,300,500),ncol=1,dimnames=list(LETTERS[1:4],"x")) y <- matrix(c(75,100,300,400),ncol=1,dimnames=list(LETTERS[1:4], "y")) dea.plot.frontier(x,y,RTS="vrs",txt=LETTERS[1:length(x)], xlim=c(0,1000),ylim=c(0,1000) ) dea.plot.frontier(x,y,RTS="drs", add=TRUE, lty="dashed", lwd=2) dea.plot.frontier(x,y,RTS="crs", add=TRUE, lty="dotted") dea(x,y,RTS="crs") M <- make.merge(list(c(1,2), c(3,4)), X=x) xmer <- M %*% x ymer <- M %*% y points(xmer,ymer,pch=8) text(xmer,ymer,labels=c("A+B","C+D"),pos=4) dea.merge(x,y,M, RTS="vrs") dea.merge(x,y,M, RTS="crs")
Draw a graph of a DEA technology. Designed for two goods illustrations, either isoquant (2 inputs), transformation curve (2 outputs), or a production function (1 input and 1 output). If the number of good is larger than 2 then aggregation occur, either simple or weighted.
dea.plot(x, y, RTS="vrs", ORIENTATION="in-out", txt=NULL, add=FALSE, wx=NULL, wy=NULL, TRANSPOSE=FALSE, fex=1, GRID=FALSE, RANGE=FALSE, param=NULL, ..., xlim, ylim, xlab, ylab) dea.plot.frontier(x, y, RTS="vrs",...) dea.plot.isoquant(x1, x2, RTS="vrs",...) dea.plot.transform(y1, y2, RTS="vrs",...)
dea.plot(x, y, RTS="vrs", ORIENTATION="in-out", txt=NULL, add=FALSE, wx=NULL, wy=NULL, TRANSPOSE=FALSE, fex=1, GRID=FALSE, RANGE=FALSE, param=NULL, ..., xlim, ylim, xlab, ylab) dea.plot.frontier(x, y, RTS="vrs",...) dea.plot.isoquant(x1, x2, RTS="vrs",...) dea.plot.transform(y1, y2, RTS="vrs",...)
x |
The good illustrated on the first axis. If there are more
than 1 input then inputs are just summed or, if |
y |
The good illustrated on the second axis. If there are more
than 1 output then outputs are just summed or, if |
x1 , y1
|
The good illustrated on the first axis |
x2 , y2
|
The good illustrated on the second axis |
RTS |
Underlying DEA model / assumptions about returns to
scale: "fdh" (0), "vrs" (1), "drs" (2), "crs" (3), "irs" (4),
"irs2" (5) (irs without convexity), "add" (6), and "fdh+" (7).
Numbers in parenthesis can also be used as values for
|
ORIENTATION |
Input-output graph of 1 input and 1 output is "in-out" (0), graph of 2 inputs is "in" (1), and graph of 2 outputs is "out" (2). |
txt |
|
add |
For |
wx |
Weight to aggregate the first axis if there are more than 1 good behind the first axis. |
wy |
Weights to aggregate for the second axis if there are more than 1 good behind the second the second axis. |
TRANSPOSE |
Only relevant for more than 1 good for each axis,
see |
GRID |
If |
... |
Usual options for the methods |
fex |
Relative size of the text/labels on observations;
corresponds to |
RANGE |
A logical variable, if |
param |
Possible parameters. At the moment only used for
RTS="fdh+"; see the section details and examples for its
use. Future versions might also use |
xlim |
Possible limits |
ylim |
Possible limits |
xlab |
Possible label for the x-axis |
ylab |
Possible label for the y-axis |
The method dea.plot
is the general plotting
method. The the 3 others are specialized versions for frontiers (1
input and 1 output), isoquant curves (2 inputs for given outputs),
and transformation curves (2 outputs for given inputs) obtained by
using the argument ORIENTATION
.
The crs factor in RTS="fdh+" that sets the lower and upper bound
can be changed by the argument param
that will set the lower
and upper bound to 1-param and 1+param; the default value is
param=.15. The value must be greater than or equal to 0 and
strictly less than 1. A value of 0 corresponds to RTS="fdh". The
FDH+ technology set is described in Bogetoft and Otto (2011) pages
72–73.
No return, uses the original graphing system.
If there are more than 1 good for the arguments x
and
y
then the goods are just summed or, if wx
or
wy
are present, weighted sum of goods are used. In this
case the use of the command identify
must be called as
dea.plot(rowSums(x),rowSums(y))
.
Warning If you use this facility to plot multi input and multi output then the plot may deceive you as fully multi efficient firms are not necessarily placed on the two dimensional frontier.
Note that RTS="add"
and RTS="fdh+"
only works for
ORIENTATION="in-out"
(0).
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
Paul Murrell; R Graphics; Chapman & Hall 2006
The documentation for the function plot
and Murrell
(2006) for further options and on customizing plots.
x <- matrix(c(100,200,300,500,600,100),ncol=1) y <- matrix(c(75,100,300,400,400,50),ncol=1) dea.plot(x,y,RTS="vrs",ORIENTATION="in-out",txt=LETTERS[1:length(x)]) dea.plot(x,y,RTS="crs",ORIENTATION="in-out",add=TRUE,lty="dashed") dea.plot.frontier(x,y,txt=1:dim(x)[1]) n <- 10 x <- matrix(1:n,,1) y <- matrix(x^(1.6) + abs(rnorm(n)),,1) dea.plot.frontier(x,y,RTS="irs",txt=1:n) dea.plot.frontier(x,y,RTS="irs2",add=TRUE,lty="dotted") # Two different forms of irs: irs and irs2, and two different ways to # make a frontier id <- sample(1:n,30,replace=TRUE) dea.plot(x[id],y[id],RTS="irs",ORIENTATION="in-out") dea.plot.frontier(x[id],y[id],RTS="irs2") # Difference between the FDH technology and the additive # FRH technology x <- matrix(c(100,220,300,520,600,100),ncol=1) y <- matrix(c(75,100,300,400,400,50),ncol=1) dea.plot(x,y,RTS="fdh",ORIENTATION="in-out",txt=LETTERS[1:length(x)]) dea.plot(x,y,RTS="add",ORIENTATION="in-out",add=TRUE,lty="dashed",lwd=2) dea.plot(x,y,RTS="fdh+",ORIENTATION="in-out",add=TRUE, lty="dotted",lwd=3,col="red") # Use of parameter in FDH+ dea.plot(x,y,RTS="fdh",ORIENTATION="in-out",txt=LETTERS[1:length(x)]) dea.plot(x,y,RTS="fdh+",ORIENTATION="in-out",add=TRUE,lty="dashed") dea.plot(x,y,RTS="fdh+",ORIENTATION="in-out",add=TRUE,lty="dotted",param=.5)
x <- matrix(c(100,200,300,500,600,100),ncol=1) y <- matrix(c(75,100,300,400,400,50),ncol=1) dea.plot(x,y,RTS="vrs",ORIENTATION="in-out",txt=LETTERS[1:length(x)]) dea.plot(x,y,RTS="crs",ORIENTATION="in-out",add=TRUE,lty="dashed") dea.plot.frontier(x,y,txt=1:dim(x)[1]) n <- 10 x <- matrix(1:n,,1) y <- matrix(x^(1.6) + abs(rnorm(n)),,1) dea.plot.frontier(x,y,RTS="irs",txt=1:n) dea.plot.frontier(x,y,RTS="irs2",add=TRUE,lty="dotted") # Two different forms of irs: irs and irs2, and two different ways to # make a frontier id <- sample(1:n,30,replace=TRUE) dea.plot(x[id],y[id],RTS="irs",ORIENTATION="in-out") dea.plot.frontier(x[id],y[id],RTS="irs2") # Difference between the FDH technology and the additive # FRH technology x <- matrix(c(100,220,300,520,600,100),ncol=1) y <- matrix(c(75,100,300,400,400,50),ncol=1) dea.plot(x,y,RTS="fdh",ORIENTATION="in-out",txt=LETTERS[1:length(x)]) dea.plot(x,y,RTS="add",ORIENTATION="in-out",add=TRUE,lty="dashed",lwd=2) dea.plot(x,y,RTS="fdh+",ORIENTATION="in-out",add=TRUE, lty="dotted",lwd=3,col="red") # Use of parameter in FDH+ dea.plot(x,y,RTS="fdh",ORIENTATION="in-out",txt=LETTERS[1:length(x)]) dea.plot(x,y,RTS="fdh+",ORIENTATION="in-out",add=TRUE,lty="dashed") dea.plot(x,y,RTS="fdh+",ORIENTATION="in-out",add=TRUE,lty="dotted",param=.5)
Calculate efficiencies for Farrell and sfa object. For a sfa there are several types
eff( object, ... ) efficiencies( object, ... ) ## Default S3 method: efficiencies( object, ... ) ## S3 method for class 'Farrell' efficiencies(object, type = "Farrell", ...) ## S3 method for class 'Farrell' eff(object, type = "Farrell", ...) ## S3 method for class 'sfa' efficiencies(object, type = "BC", ...) ## S3 method for class 'sfa' eff(object, type = "BC", ...)
eff( object, ... ) efficiencies( object, ... ) ## Default S3 method: efficiencies( object, ... ) ## S3 method for class 'Farrell' efficiencies(object, type = "Farrell", ...) ## S3 method for class 'Farrell' eff(object, type = "Farrell", ...) ## S3 method for class 'sfa' efficiencies(object, type = "BC", ...) ## S3 method for class 'sfa' eff(object, type = "BC", ...)
object |
A Farrell object returned from a DEA function like dea, sdea, or mea or an sfa object returned from the function sfa. |
type |
The type of efficiencies to be calculated. For a Farrell object the possibilities are “Farrell” efficiency or “Shephard” efficiency. For a sfa object the possibilities are “BC”, “Mode”, “J”, or “add”. |
... |
Further arguments ... |
The possible types for class Farrell
(an object returned from
dea
et al. are “Farrell” and “Shephard”.
The possible types for class sfa
efficiencies are
Efficiencies estimated by minimizing the mean square error; Eq. (7.21) in Bogetoft and Otto (2011, 219) and Battese and Coelli (1988, 392)
Efficiencies estimates using the conditional mode approach; Bogetoft and Otto (2011, 219), Jondrow et al. (1982, 235).
Efficiencies estimates using the conditional mean approach Jondrow et al. (1982, 235).
Efficiency in the additive model, Bogetoft and Otto (2011, 219)
The efficiencies are returned as an array.
For the Farrell object the orientation is determined by the calculations that led to the object and cannot be changed here.
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R, Springer 2011
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets.
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets.
A method to estimate and plot kernel estimate of (Farrell) efficiencies taken into consideration that efficiencies are bounded either above (input direction) or below (output direction).
eff.dens(eff, bw = "nrd0") eff.dens.plot(obj, bw = "nrd0", ..., xlim, ylim, xlab, ylab)
eff.dens(eff, bw = "nrd0") eff.dens.plot(obj, bw = "nrd0", ..., xlim, ylim, xlab, ylab)
eff |
Either a list of (Farrell) efficiencies or a Farrell
object returned from the method |
bw |
Bandwith, look at the documentation of |
obj |
Either an array of efficiencies or a list returned from
|
... |
Further arguments to the |
xlim |
Range on the x-axis; usually not needed, just use the defaults. |
ylim |
Range on the x-axis; usually not needed, just use the defaults. |
xlab |
Label for the x-axis. |
ylab |
Label for the y-axis. |
The calculation is based on a reflection method (Silverman
1986, 30) using the default window kernel and default bandwidth (window
width) in the method density
.
The method eff.dens.plot
plot the density directly, and
eff.dens
just estimate the numerical density, and the result
can then either be plotted by plot
, corresponds to
eff.dens.plot
, or by lines as an overlay on an existing plot.
The return from eff.dens
is a list list(x,y)
with efficiencies and the corresponding density values.
The input efficiency is also bounded below by 0, but for normal firms an efficiency at 0 will not happen, i.e. the boundary is not effective, and therefore this boundary is not taken into consideration.
Peter Bogetoft and Lars Otto [email protected]
B.W. Silverman (1986), Density Estimation for Statistics and Data Analysis, Chapman and Hall, London.
e <- 1 - rnorm(100) e[e>1] <- 1 e <- e[e>0] eff.dens.plot(e) hist(e, breaks=15, freq=FALSE, xlab="Efficiency", main="") den <- eff.dens(e) lines(den,lw=2)
e <- 1 - rnorm(100) e[e>1] <- 1 e <- e[e>0] eff.dens.plot(e) hist(e, breaks=15, freq=FALSE, xlab="Efficiency", main="") den <- eff.dens(e) lines(den,lw=2)
How the efficiency changes as the most influential peer is
removed sequentially one at a time. For eladder
the removed peer it
is the one that have the largest change in efficiency when removed and for
eladder2
it is the peer with the largest weight (lambda).
eladder(n, X, Y, RTS="vrs", ORIENTATION="in", XREF=NULL, YREF=NULL, DIRECT=NULL, param=NULL, MAXELAD=NULL) eladder2(n, X, Y, RTS = "vrs", ORIENTATION = "in", XREF=NULL, YREF=NULL, DIRECT = NULL, param=NULL, MAXELAD=NULL) eladder.plot(elad, peer, TRIM = NULL, xlab="Most influential peers", ylab="Efficiency", ...)
eladder(n, X, Y, RTS="vrs", ORIENTATION="in", XREF=NULL, YREF=NULL, DIRECT=NULL, param=NULL, MAXELAD=NULL) eladder2(n, X, Y, RTS = "vrs", ORIENTATION = "in", XREF=NULL, YREF=NULL, DIRECT = NULL, param=NULL, MAXELAD=NULL) eladder.plot(elad, peer, TRIM = NULL, xlab="Most influential peers", ylab="Efficiency", ...)
n |
The number of the firm where the ladder is calculated |
X |
Inputs of firms to be evaluated, a K x m matrix of
observations of K firms with m inputs (firm x input). In case
|
Y |
Outputs of firms to be evaluated, a K x n matrix of
observations of K firms with n outputs (firm x input). In case
|
RTS |
Text string or a number defining the underlying DEA
technology / returns to scale assumption, se the possible values
for |
ORIENTATION |
Input efficiency "in" (1), output efficiency
"out" (2), and graph efficiency "graph" (3). For use with
|
XREF |
Inputs of the firms determining the technology, defaults
to |
YREF |
Outputs of the firms determining the technology, defaults
to |
DIRECT |
Directional efficiency, |
param |
Possible parameters. Now only used for
RTS="fdh+" to set low and high values for restrictions on lambda;
see the section details and examples in |
MAXELAD |
The maximum number of influential peers to remove. |
elad |
The sequence of efficiencies returned from |
peer |
The sequence of peers returned from |
TRIM |
The number of characters for the name of the peers on the axis in the plot. |
xlab |
A title for the x axis |
ylab |
A title for the y axis |
... |
Usual options for the method |
The function eladder
calculates how the efficiency for a
firm changes when the most influential peer is removed sequentially one
at a time. For eladder
the largest effect is the largest change
in efficiency and for eladder2
the largest weight, lambda.
Somewhere in the sequence the firm becomes efficient and are itself removed from the set of firms generating the technology (or the only firm left) and thereafter the efficiencies are super-efficiencies and the process stops.
When it happens that there is no solution to the dea problem after
removing a series of peers then the program might stop before
MAXELAD
peers have been removed.
The object returned from eladder
is a list with components
eff |
The sequence of efficiencies when the peer with the largest value of lambda has been removed. |
peer |
The sequence of removed peers corresponding to the largest
values of lambda as index in the |
When the number of firms is large then the number of influential peers will also be large and the names or numbers of the peers on the x-axis might be squeeze together and be illegible. In this case restrict the number of influential peers to be removed.
The efficiency step ladder is discussed in Essay III of Dag Fjeld Edvardsen's Ph.D. thesis from 2004.
Peter Bogetoft and Lars Otto [email protected]
Dag Fjeld Edvardsen; Four Essays on the Measurement of Productive Efficiency; University of Gothenburg 2004; http://hdl.handle.net/2077/2923
data(charnes1981) x <- with(charnes1981, cbind(x1,x2,x3,x4,x5)) y <- with(charnes1981, cbind(y1,y2,y3)) # Choose the firm for analysis, we choose 'Tacoma' n <- which(charnes1981$name=="Tacoma")[1] el <- eladder(n, x, y, RTS="crs") eladder.plot(el$eff, el$peer) # Restrict to 20 most influential peers for 'Tacoma' and use names # instead of number eladder.plot(el$eff[1:20], charnes1981$name[el$peer][1:20]) # Truncate the names of the peers and put a title on top eladder.plot(el$eff[1:20], charnes1981$name[el$peer][1:20], TRIM=5) title("Eladder for Tacoma")
data(charnes1981) x <- with(charnes1981, cbind(x1,x2,x3,x4,x5)) y <- with(charnes1981, cbind(y1,y2,y3)) # Choose the firm for analysis, we choose 'Tacoma' n <- which(charnes1981$name=="Tacoma")[1] el <- eladder(n, x, y, RTS="crs") eladder.plot(el$eff, el$peer) # Restrict to 20 most influential peers for 'Tacoma' and use names # instead of number eladder.plot(el$eff[1:20], charnes1981$name[el$peer][1:20]) # Truncate the names of the peers and put a title on top eladder.plot(el$eff[1:20], charnes1981$name[el$peer][1:20], TRIM=5) title("Eladder for Tacoma")
Excess input compared over frontier input and/or less output than frontier/transformation/optimal output.
excess(object, X = NULL, Y = NULL)
excess(object, X = NULL, Y = NULL)
object |
A Farrell object as returned from functions like dea, dea.direct, sdea, and mea. |
X |
Input matrix, only necessary for ordinary input Farrell efficiency |
Y |
Ouput matrix , only necessary for ordinary output Farrell efficiency |
For Farrell input efficiency E the exess input is and
for Farrell ouput efficiency F the missing output is
.
Notice that the excess calculated does not include any slack values. In
case slacks are present and calculated it might be more appropriate
to add slack, i.e. to use excess(object, X, Y) + slack(X, Y, object)
.
For directional efficiency e in the direction D the excess input is
.
If a firm is outside the technology set, as could be the case when calculating super-efficiencies, the Farrell input efficiency is larger than 1, and then the excess values are negative.
Return a matrix with exces input and/or less output.
Peter Bogeroft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) e <- dea(x,y) excess(e,x) x - eff(e) * x e <- dea(x,y, ORIENTATION="graph") excess(e, x, y) x - eff(e) * x 1/eff(e) * y -y me <- mea(x,y) excess(me)
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) e <- dea(x,y) excess(e,x) x - eff(e) * x e <- dea(x,y, ORIENTATION="graph") excess(e, x, y) x - eff(e) * x 1/eff(e) * y -y me <- mea(x,y) excess(me)
The lambdas, i.e. the weight of the peers, for each firm.
lambda(object, KEEPREF = FALSE) lambda.print(x, KEEPREF = FALSE, ...)
lambda(object, KEEPREF = FALSE) lambda.print(x, KEEPREF = FALSE, ...)
object , x
|
A Farrell object as returned from |
KEEPREF |
if |
... |
Optional parameters for the print method. |
Only returns the lambdas for firms that appear as a peer, i.e. only lambdas for firms where at least one element of the lambda is positive.
The return is a matrix with the firms as rows and the peers as columns.
Peter Bogetoft and Lars Otto [email protected]
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets.
##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets.
Make an aggregation matrix to perform mergers of firms. The matrix can be post multiplied (matrix multiplication) to input and output matrices to make merged input and output matrices.
make.merge(grp, nFirm = NULL, X = NULL, names = NULL)
make.merge(grp, nFirm = NULL, X = NULL, names = NULL)
grp |
Either a list of length Kg for Kg firms after mergers; each component of the list is a (named) list with the firm numbers or names going into this merger. Or a factor of length K with Kg levels where each level determines a merger; to exclude firms for mergers set the factor value to NA. |
nFirm |
Number of firms before the mergers |
X |
A matrix of inputs or outputs where the rows corresponds to the number of original (starting) firms |
names |
A list with names of all firms, only needed if the
mergers are given as a list of names, i.e. |
Either nFirm
or X
must be present; if both are present
then nFirm
must be equal to the number of rows in X
, the
number of firms.
When X
is an input matrix of dimension K x m
, K
firms and m
inputs, and M <- make.merge(gr,K)
then
M %*% X
is the input matrix for the merged firms.
Returns an aggregation matrix of dimension Kg times K where rows corresponds to new merged firms and columns are 1 for firms to be included and 0 for firms to be excluded in the given merger as defined by the row.
The argument TRANSPOSE
has not been implemented for this
function. If you need transposed matrices you must transpose the
merger matrix yourself. If you define mergers via factors there is no
need to transpose in the arguments; just do not use X
in the
arguments.
Peter Bogetoft and Lars Otto [email protected]
# To merge firms 1,3, and 5; and 2 and 4 of 7 firms into 2 new firms # the aggregation matrix is M; not all firms are involved in a merger. M <- make.merge(list(c(1,3,5), c(2,4)),7) print(M) # Merge 1 and 2, and 4 and 5, and leave 3 alone, total of 5 firms. # Using a list M1 <- make.merge(list(c(1,2), c(4,5)), nFirm=5) print(M1) # Using a factor fgr <- factor(c("en","en",NA,"to","to")) M2 <- make.merge(fgr) print(M2) # Name of mergers M3 <- make.merge(list(AB=c("A","B"), DE=c("D","E")), names=LETTERS[1:5]) print(M3) # No name of mergers M4 <- make.merge(list(c("A","B"), c("D","E")), names=LETTERS[1:5]) print(M4)
# To merge firms 1,3, and 5; and 2 and 4 of 7 firms into 2 new firms # the aggregation matrix is M; not all firms are involved in a merger. M <- make.merge(list(c(1,3,5), c(2,4)),7) print(M) # Merge 1 and 2, and 4 and 5, and leave 3 alone, total of 5 firms. # Using a list M1 <- make.merge(list(c(1,2), c(4,5)), nFirm=5) print(M1) # Using a factor fgr <- factor(c("en","en",NA,"to","to")) M2 <- make.merge(fgr) print(M2) # Name of mergers M3 <- make.merge(list(AB=c("A","B"), DE=c("D","E")), names=LETTERS[1:5]) print(M3) # No name of mergers M4 <- make.merge(list(c("A","B"), c("D","E")), names=LETTERS[1:5]) print(M4)
Estimates Malmquist indices for productivity and its decomposition between two periods. The units in the two periods does not have to be exactly the same, but the Malmquist index is only calculated for units present in both periods.
malmq(X0, Y0, ID0 = NULL, X1, Y1, ID1 = NULL, RTS = "vrs", ORIENTATION = "in", SAMEREF=FALSE, SLACK = FALSE, DUAL = FALSE, DIRECT = NULL, param = NULL, TRANSPOSE = FALSE, FAST = TRUE, LP = FALSE, CONTROL = NULL, LPK = NULL)
malmq(X0, Y0, ID0 = NULL, X1, Y1, ID1 = NULL, RTS = "vrs", ORIENTATION = "in", SAMEREF=FALSE, SLACK = FALSE, DUAL = FALSE, DIRECT = NULL, param = NULL, TRANSPOSE = FALSE, FAST = TRUE, LP = FALSE, CONTROL = NULL, LPK = NULL)
X0 |
Inputs of firms in period 0, a K0 x m matrix of observations of K0 firms with m inputs (firm x input). |
Y0 |
Outputs of firms in period 0, a K0 x n matrix of observations of K0 firms with n outputs (firm x input). |
ID0 |
Index for firms in period 0; could be numbers or labels. Length K0. |
X1 |
Inputs of firms in period 1, a K1 x m matrix of observations of K1 firms with m inputs (firm x input). |
Y1 |
Outputs of firms in period 1, a K1 x n matrix of observations of K1 firms with n outputs (firm x input). |
ID1 |
Index for firms in period 0; could be numbers or labels. Length K0. |
RTS |
Returns to scale assumption as in |
ORIENTATION |
Input efficiency "in" (1), output
efficiency "out" (2), and graph efficiency "graph" (3) as in |
SAMEREF |
Use the same units for reference technology when comparing two periods. This is not restricted to same units in several timpe periods, but only to pairwise periods comparisons for Malmquist. Default is to use available and possible differnt units in pairwise periods. |
SLACK |
See |
DUAL |
See |
DIRECT |
See |
param |
See |
TRANSPOSE |
See |
FAST |
See |
LP |
See |
CONTROL |
See |
LPK |
See |
The order of the units in values is given by the returned value id
.
This is usefull if the order of units differ completely between ID0
and ID1
.
The index for technical changes tc
is calculated as sqrt(e10/e11 * e00/e01)
where e<s><t>
is the efficiency for period s
when the reference technology is
for period t
, i.e. determined from the observations for period t
and
XREF=X_t, YREF=Y_t
, as is the option for the function dea
.
The Malmquist index for productivity mq
is calculates as sqrt(e10/e00 * e11/e01)
and the
index for change in efficiency ec
is e11/e00
. Note that mq = tc * ec
.
m |
Malmquist index for productivity. |
tc |
Index for technology change. |
ec |
Index for efficiency change. |
mq |
Malmquist index for productivity; same as |
id |
Index for firms present in both period 0 and period 1. |
id0 |
Index for firms in period 0 that are also in period 1. |
id1 |
Index for firms in period 1 that are also in period 0. |
e00 |
The efficiencies for period 0 with reference technology from period 0. |
e10 |
The efficiencies for period 1 with reference technology from period 0. |
e11 |
The efficiencies for period 1 with reference technology from period 1. |
e01 |
The efficiencies for period 0 with reference technology from period 1. |
The calculations of efficiencies are only done for units present in both periods.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
x0 <- matrix(c(10, 28, 30, 60),ncol=1) y0 <- matrix(c(5, 7, 10, 15),ncol=1) x1 <- matrix(c(12, 26, 16, 60 ),ncol=1) y1 <- matrix(c(6, 8, 9, 15 ),ncol=1) dea.plot(x0, y0, RTS="vrs", txt=TRUE) dea.plot(x1, y1, RTS="vrs", add=TRUE, col="red") points(x1, y1, col="red", pch=16) text(x1, y1, 1:dim(x1)[1], col="red", adj=-1) m <- malmq(x0,y0,,x1,y1,,RTS="vrs") print("Malmquist index for change in productivity, technology change:") print(m$mq) print("Index for change of frontier:") print(m$tc)
x0 <- matrix(c(10, 28, 30, 60),ncol=1) y0 <- matrix(c(5, 7, 10, 15),ncol=1) x1 <- matrix(c(12, 26, 16, 60 ),ncol=1) y1 <- matrix(c(6, 8, 9, 15 ),ncol=1) dea.plot(x0, y0, RTS="vrs", txt=TRUE) dea.plot(x1, y1, RTS="vrs", add=TRUE, col="red") points(x1, y1, col="red", pch=16) text(x1, y1, 1:dim(x1)[1], col="red", adj=-1) m <- malmq(x0,y0,,x1,y1,,RTS="vrs") print("Malmquist index for change in productivity, technology change:") print(m$mq) print("Index for change of frontier:") print(m$tc)
Estimate Malmquist index for firms in a panel data set. The data set does not need to be balanced.
malmquist(X, Y, ID, TIME, RTS = "vrs", ORIENTATION = "in", SAMEREF=FALSE, SLACK = FALSE, DUAL = FALSE, DIRECT = NULL, param = NULL, TRANSPOSE = FALSE, FAST = TRUE, LP = FALSE, CONTROL = NULL, LPK = NULL)
malmquist(X, Y, ID, TIME, RTS = "vrs", ORIENTATION = "in", SAMEREF=FALSE, SLACK = FALSE, DUAL = FALSE, DIRECT = NULL, param = NULL, TRANSPOSE = FALSE, FAST = TRUE, LP = FALSE, CONTROL = NULL, LPK = NULL)
X |
Inputs of firms in many periods, a (T*K) x m matrix of observations of K firms with m outputs (firm x input) in at the most T periods. |
Y |
Outputs of firms in many periods, a (T*K) x n matrix of observations of K0 firms with n outputs (firm x input) in at the most T periods. |
ID |
Identifier for the firms in rows of |
TIME |
Array with period number for each row in the input maxtrix |
RTS |
Returns to scale assumption as in |
ORIENTATION |
Input efficiency "in" (1), output
efficiency "out" (2), and graph efficiency "graph" (3) as in |
SAMEREF |
Use the same units for reference technology when comparing two periods. This is not restricted to same units in several timpe periods, but only to pairwise periods comparisons for Malmquist. Default is to use available and possible differnt units in pairwise periods. |
SLACK |
See |
DUAL |
See |
DIRECT |
See |
param |
See |
TRANSPOSE |
See |
FAST |
See |
LP |
See |
CONTROL |
See |
LPK |
See |
Malmquist
uses malmq
for the calculations of the
necessary efficiencies, and the returned indices are as in malmq
.
The data must be a long data set with regards to TIME
and ID
; se the example below.
Note that the calculated index are index comparing a period and the previous period. To compare the development over time the indices must be turned into a chain index as shown in the example below.
m |
Malmquist indicies, an array of length T*K in the order of |
tc |
Technical change indices, an array of length T*K. |
ec |
Efficiency indices, an array of length T*K. |
id |
Index for firms as |
time |
Index for time as |
e00 |
The efficiencies for period 0 with reference technology from period 0. |
e10 |
The efficiencies for period 1 with reference technology from period 0. |
e11 |
The efficiencies for period 1 with reference technology from period 1. |
e01 |
The efficiencies for period 0 with reference technology from period 1. |
The lagged values e11
are not necessary equal to values of e00
as the reference technology for the two periods could be generated by
different units, if the units in different time periods are not the same.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
x0 <- matrix(c(10, 28, 30, 60),ncol=1) y0 <- matrix(c(5, 7, 10, 15),ncol=1) x1 <- matrix(c(12, 26, 16, 60 ),ncol=1) y1 <- matrix(c(6, 8, 9, 15 ),ncol=1) x2 <- matrix(c(13, 26, 15, 60 ),ncol=1) y2 <- matrix(c(7, 9, 10, 15 ),ncol=1) dea.plot(x0, y0, RTS="vrs", txt=TRUE) dea.plot(x1, y1, RTS="vrs", add=TRUE, col="red") dea.plot(x2, y2, RTS="vrs", add=TRUE, col="blue") points(x1, y1, col="red", pch=16) # points(x2, y2, col="blue", pch=17) text(x1, y1, 1:dim(x1)[1], col="red", adj=-1) text(x2, y2, 1:dim(x1)[1], col="blue", adj=-1) legend("bottomright", legend=c("Period 0", "Period 1", "Period 2"), col=c("black", "red", "blue"), lty=1, pch=c(1,16, 17), bty="n") X <- rbind(x0, x1, x2) Y <- rbind(y0, y1, y2) # Make ID and TIME variables one way or another ID <- rep(1:dim(x1)[1], 3) # TIME <- c(rep(0,dim(x1)[1]), rep(1,dim(x1)[1]), rep(2,dim(x1)[1])) TIME <- gl(3, dim(x1)[1], labels=0:2) # This is how the data for Malmquist must look like data.frame(TIME, ID, X, Y) mq <- malmquist(X,Y, ID, TIME=TIME) data.frame(TIME, ID, X, Y, mq$e00, mq$e01, mq$e10, mq$e11, mq$m, mq$tc)[order(ID, TIME),] # How to make the Malmquist indices to a chain index # Make data.frame with indices DM <- data.frame(TIME, ID, m=mq$m, tc=mq$tc, ec=mq$ec) # Set missing index for first period to 1, the base DM[DM$TIME==0, c("m","tc", "ec")] <- 1 # Make chain index of the individual indices AD <- aggregate(cbind(m=DM$m), by=list(ID=DM$ID), cumprod) # Compare chain index to original index data.frame(ID, TIME, m=c(AD$m), DM$m)
x0 <- matrix(c(10, 28, 30, 60),ncol=1) y0 <- matrix(c(5, 7, 10, 15),ncol=1) x1 <- matrix(c(12, 26, 16, 60 ),ncol=1) y1 <- matrix(c(6, 8, 9, 15 ),ncol=1) x2 <- matrix(c(13, 26, 15, 60 ),ncol=1) y2 <- matrix(c(7, 9, 10, 15 ),ncol=1) dea.plot(x0, y0, RTS="vrs", txt=TRUE) dea.plot(x1, y1, RTS="vrs", add=TRUE, col="red") dea.plot(x2, y2, RTS="vrs", add=TRUE, col="blue") points(x1, y1, col="red", pch=16) # points(x2, y2, col="blue", pch=17) text(x1, y1, 1:dim(x1)[1], col="red", adj=-1) text(x2, y2, 1:dim(x1)[1], col="blue", adj=-1) legend("bottomright", legend=c("Period 0", "Period 1", "Period 2"), col=c("black", "red", "blue"), lty=1, pch=c(1,16, 17), bty="n") X <- rbind(x0, x1, x2) Y <- rbind(y0, y1, y2) # Make ID and TIME variables one way or another ID <- rep(1:dim(x1)[1], 3) # TIME <- c(rep(0,dim(x1)[1]), rep(1,dim(x1)[1]), rep(2,dim(x1)[1])) TIME <- gl(3, dim(x1)[1], labels=0:2) # This is how the data for Malmquist must look like data.frame(TIME, ID, X, Y) mq <- malmquist(X,Y, ID, TIME=TIME) data.frame(TIME, ID, X, Y, mq$e00, mq$e01, mq$e10, mq$e11, mq$m, mq$tc)[order(ID, TIME),] # How to make the Malmquist indices to a chain index # Make data.frame with indices DM <- data.frame(TIME, ID, m=mq$m, tc=mq$tc, ec=mq$ec) # Set missing index for first period to 1, the base DM[DM$TIME==0, c("m","tc", "ec")] <- 1 # Make chain index of the individual indices AD <- aggregate(cbind(m=DM$m), by=list(ID=DM$ID), cumprod) # Compare chain index to original index data.frame(ID, TIME, m=c(AD$m), DM$m)
Potential improvements PI or multi-directional efficiency analysis. The result is an excess value measures by the direction.
The direction is determined by the direction corresponding to the minimum input/maximum direction each good can be changed when they are changed one at a time.
mea(X, Y, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, param=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL = NULL, LPK = NULL) mea.lines(N, X, Y, ORIENTATION="in")
mea(X, Y, RTS = "vrs", ORIENTATION = "in", XREF = NULL, YREF = NULL, FRONT.IDX = NULL, param=NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL = NULL, LPK = NULL) mea.lines(N, X, Y, ORIENTATION="in")
X |
K times m matrix with K firms and m inputs as in |
|||||||||||||||||||||
Y |
K times n matrix with K firms and n outputs as in |
|||||||||||||||||||||
RTS |
Text string or a number defining the underlying DEA technology / returns to scale assumption.
|
|||||||||||||||||||||
ORIENTATION |
Input efficiency "in" (1) or output efficiency "out" (2), and also the additional option "in-out" (0) for both input and output direction. |
|||||||||||||||||||||
XREF |
Inputs of the firms determining the technology, defaults
to |
|||||||||||||||||||||
YREF |
Outputs of the firms determining the technology,
defaults to |
|||||||||||||||||||||
FRONT.IDX |
Index for firms determining the technology |
|||||||||||||||||||||
param |
Possible parameters. At the moment only used for
RTS="fdh+" to set low and high values for restrictions on lambda;
see the section details and examples in |
|||||||||||||||||||||
TRANSPOSE |
as in |
|||||||||||||||||||||
LP |
as in |
|||||||||||||||||||||
CONTROL |
as in |
|||||||||||||||||||||
LPK |
as in |
|||||||||||||||||||||
N |
Number of firms where directional lines are to be drawn on an already existing frontier plot (dea.plot.frontier) |
Details can be found in Bogetoft and Otto (2011, 121–124).
This method is for input directional efficiency only interesting when there are 2 or more inputs, and for output only when there are 2 or more outputs.
The results are returned in a Farrell object with the following components.
eff |
Excess value in DIRECT units of measurement, this is not Farrell efficiency |
lambda |
The lambdas, i.e. the weight of the peers, for each firm |
objval |
The objective value as returned from the LP program, normally the same as eff |
RTS |
The return to scale assumption as in the option |
ORIENTATION |
The efficiency orientation as in the call |
direct |
A K times m|n|m+n matrix with directions for each firm: the number of columns depends on whether it is input, output or in-out orientated. |
TRANSPOSE |
As in the call |
The calculation is done in dea
after a
calculation of the direction that then is used in the argument
DIRECT
. The calculation of the direction is done in a series
LP programs, one for each good in the direction.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011
dea
and the argument DIRECT
.
X <- matrix(c(2, 2, 5, 10, 10, 3, 12, 8, 5, 4, 6,12), ncol=2) Y <- matrix(rep(1,dim(X)[1]), ncol=1) dea.plot.isoquant(X[,1], X[,2],txt=1:dim(X)[1]) mea.lines(c(5,6),X,Y) me <- mea(X,Y) me peers(me) # MEA potential saving in inputs, exces inputs eff(me) * me$direct me$eff * me$direct # Compare to traditionally Farrell efficiency e <- dea(X,Y) e peers(e) # Farrell potential saving in inputs, excess inputs (1-eff(e)) * X
X <- matrix(c(2, 2, 5, 10, 10, 3, 12, 8, 5, 4, 6,12), ncol=2) Y <- matrix(rep(1,dim(X)[1]), ncol=1) dea.plot.isoquant(X[,1], X[,2],txt=1:dim(X)[1]) mea.lines(c(5,6),X,Y) me <- mea(X,Y) me peers(me) # MEA potential saving in inputs, exces inputs eff(me) * me$direct me$eff * me$direct # Compare to traditionally Farrell efficiency e <- dea(X,Y) e peers(e) # Farrell potential saving in inputs, excess inputs (1-eff(e)) * X
Data colected from Danish milk producers.
data(milkProd)
data(milkProd)
A data frame with 108 observations on the following 5 variables.
farmNo
farm number
milk
Output of milk, kg
energy
Energy expenses
vet
Veterinary expenses
cows
Number of cows
Data as .csv
are loaded by the command data
using
read.table(..., header = TRUE, sep = ";")
such that this file
is a semicolon separated file and not a comma separated file.
Accounting and business check data
data(milkProd) y <- with(milkProd, cbind(milk)) x <- with(milkProd, cbind(energy, vet, cows))
data(milkProd) y <- with(milkProd, cbind(milk)) x <- with(milkProd, cbind(energy, vet, cows))
A data set for 113 farmers in forestry in Norway.
data(norWood2004)
data(norWood2004)
A data frame with 113 observations on the following 7 variables.
firm
firm number
m
Variable cost
x
Woodland, value of forest and land
y
Profit
z1
Secondary income from ordinary farming
z3
Age of forest owner
z6
Whether there is a long-term plan =1 or not =0
Collected from farmers in forestry.
Data as .csv
are loaded by the command data
using
read.table(..., header=TRUE, sep=";")
such that this file
is a semicolon separated file and not a comma separated file.
Norwegian Agricultural Economics Research Institute.
data(norWood2004) ## maybe str(norWood2004) ; plot(norWood2004) ...
data(norWood2004) ## maybe str(norWood2004) ; plot(norWood2004) ...
The functions implements the Wilson (1993) outlier detection method. One written entirely in R and another written in C++.
outlier.ap (X, Y, NDEL = 3, NLEN = 25, TRANSPOSE = FALSE) outlierC.ap(X, Y, NDEL = 3, NLEN = 25, TRANSPOSE = FALSE) outlier.ap.plot(ratio, NLEN = 25, xlab = "Number of firms deleted", ylab = "Log ratio", ..., ylim)
outlier.ap (X, Y, NDEL = 3, NLEN = 25, TRANSPOSE = FALSE) outlierC.ap(X, Y, NDEL = 3, NLEN = 25, TRANSPOSE = FALSE) outlier.ap.plot(ratio, NLEN = 25, xlab = "Number of firms deleted", ylab = "Log ratio", ..., ylim)
X |
Input as a firms times goods matrix, see |
Y |
Output as a firms times goods matrix, see
|
NDEL |
The maximum number of firms to be considered as a group of outliers, i.e. the maximum number of firms to be deleted. |
NLEN |
The number of ratios to save for each level
or removal, the number of rows in |
TRANSPOSE |
Input and output matrices are treated as firms
times goods matrices for the default value |
ratio |
The |
xlab |
Label for the x-axis. |
ylab |
Label for the y-axis |
ylim |
The y limits |
... |
Usual options for the methods |
An implementation of the method in Wilson (1993) using only R
functions and especially the function det
to calculate
. The alternative method
outlierC.ap
is written completely in C++ and is much faster, but still not
as fast at the method in FEAR.
An elementary presentation of the method is found in Bogetoft and Otto (2011), Sect. 5.13 on outliers.
For a data set with 10 firms and considering at the most 3 outliers there are
175 combinations of firms to delete. For 100 firms there are 166,750
combinations and for at most 5 outliers there are 79,375,495 combinatins, for
at most 8 outliers there are 203,366,882,995 combinations. For 200 firms whith
respectively 3,5 and 8 outliers there are 1,333,500, and 2,601,668,490, and
a number we do not know what to call
57,467,902,686,615 combinations. Thus the number of combinations are increasing
exponentialy in both number of firms and number of firms to be deleted and so
is the computational time. Thus you should limit the numbers NDEL
to a
very small number like at the most 3 or perhabs 5 depending of the number
of firms. Or you should use the extremely fast method ap
from the
package FEAR mentioned in the references.
ratio |
A |
imat |
A |
r0 |
A |
The function outlier.ap
is extremely slow and for NDEL
larger than 3 or 4 it might be advisable to use the function ap
from the package FEAR.
The name of the returned components are the same as for ap
in
the package FEAR.
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R; Springer 2011
Wilson (1993), “Detecing outliers in deterministic nonparametric frontier models with multiple outputs,” Journal of Business and Economic Statistics 11, 319-323.
Wilson (2008), “FEAR 1.0: A Software Package for Frontier Efficiency Analysis with R,” Socio-Economic Planning Sciences 42, 247–254
The function ap
in the package FEAR.
n <- 25 x <- matrix(rnorm(n)) y <- .5 + 2.5*x + 2*rnorm(25) tap <- outlier.ap(x,y, NDEL=2) print(cbind(tap$imat,tap$rmin), na.print="", digit=2) outlier.ap.plot(tap$ratio)
n <- 25 x <- matrix(rnorm(n)) y <- .5 + 2.5*x + 2*rnorm(25) tap <- outlier.ap(x,y, NDEL=2) print(cbind(tap$imat,tap$rmin), na.print="", digit=2) outlier.ap.plot(tap$ratio)
The function peers
finds for each firm its peers,
get.number.peers
finds for each peer the number of times
this peer apears as a peer, and get.which.peers
determines
for one or more peers the firms they appear as peers for. Also
include a function get.peers.lambda
to calculate for firms the
importance (lambdas) of peers.
peers(object, NAMES = FALSE, N=1:dim(object$lambda)[1], LAMBDA=0) get.number.peers(object, NAMES = FALSE, N=1:dim(object$lambda)[2], LAMBDA=0) get.which.peers(object, N = 1:dim(object$lambda)[2], LAMBDA=0) get.peers.lambda(object, N=1:dim(object$lambda)[1], LAMBDA=0)
peers(object, NAMES = FALSE, N=1:dim(object$lambda)[1], LAMBDA=0) get.number.peers(object, NAMES = FALSE, N=1:dim(object$lambda)[2], LAMBDA=0) get.which.peers(object, N = 1:dim(object$lambda)[2], LAMBDA=0) get.peers.lambda(object, N=1:dim(object$lambda)[1], LAMBDA=0)
object |
An object of class Farrell as returned by the functions
|
NAMES |
If true then names for the peers are returned if names
are available otherwise the unit index numbers are used. If |
N |
The firm(s) or peer(s) for which to get the results. |
LAMBDA |
Minimum weight for extracted peers, i.e. the extracted peers have
lambda values larger than |
The returned values are index of the firms and can be used by itself, but can also by used as an index for a variable with names of the firms.
The peers
returns a matrix with numbers for the peers for
each firm; for firms with efficiency 1 the peers are just the firm
itself. If there is slack in the evaluation of a firm with
efficiency 1, this can be found with a call to slack
,
either directly or by the argument SLACK
when a function
dea
was called to generate the Farrell object
.
The get.number.peers
returns the number of firms that a peer
serves as a peer for.
The get.peers.lambda
returns a list of firms with the peers
and corresponding value of lambda.
The return values are firm numbers. If the argument
NAMES=TRUE
is used in the function peers
the return
is a list of names of the peers if names for the firms are available
as row names.
Peers are defined as firms where the corresponding lambdas are positive.
Note that peers might change between a Farrell object return from
dea
with SLACK=FALSE
and a call with SLACK=TRUE
or a following call to the function slack
because a peer on
the frontier with slack might by the call to dea
be a
peer for itself whereas this will not happen when slacks are
calculated.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011. Sect. 4.6 page 93
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) e <- dea(x,y) peers(e) get.number.peers(e) # Who are the firms that firm 1 and 4 is peers for get.which.peers(e, c(1,4))
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) e <- dea(x,y) peers(e) get.number.peers(e) # Who are the firms that firm 1 and 4 is peers for get.which.peers(e, c(1,4))
Input and output data for 248 pig producers that also produces crop, i.e. a multi–output data set.
data(pigdata)
data(pigdata)
A data frame with 248 observations on the following 16 variables.
firm
Serial number for pig producer
x1
Input fertilizer
x2
Input feedstuf
x3
Input land
x4
Input labour
x5
Input machinery
x6
Input other capital
y2
Output crop
y4
Output pig
w1
Price of fertilizer
w2
Price of feedstuf
w3
Price of land
w4
Price of labour
w5
Price of michenery
w6
Price of other capital
p2
Price of crop
p4
Price of pig
cost
Total cost, w1*x1+...+w6*x6.
rev
Total revenue, p2*y2+p4*y4.
In raising pigs, most farmers also produce crops to feed the pigs. Labor and capital are used not just directly for pig-raising but also on the field.
Data as .csv
are loaded by the command data
using
read.table(..., header = TRUE, sep = ";")
as the file
is a semicolon separated file and not a comma separated file.
Farmers accounting data converted to index.
data(pigdata) ## maybe str(pigdata) ; plot(pigdata) ...
data(pigdata) ## maybe str(pigdata) ; plot(pigdata) ...
Accounting and production data for 101 milk producing farmers.
data(projekt)
data(projekt)
A data frame with 101 observations on the following 14 variables.
numb
Serial number for the milk producer
cows
Number of cows
vet
Veterinary expenses
unitCost
Unit cost, variable cost
capCost
Capacity cost
fixedCost
Fixed cost
milkPerCow
Milk per cow, kg
quota
Milk quota
fatPct
Fat percent in milk
protPct
Protein percent in milk
cellCount
Cell count for milk
race
Race for cows, a factor with levels jersey
,
large
, and mixed
type
Type of production, conventional or organic,
a factor with levels conv
orga
age
Age of the farmer
Data is a mix of accounting data and production controls.
Data as .csv
are loaded by the command data
using
read.table(..., header = TRUE, sep = ";")
such that this file
is a semicolon separated file and not a comma separated file.
Collected from farmers.
data(projekt) ## maybe str(projekt) ; plot(projekt) ...
data(projekt) ## maybe str(projekt) ; plot(projekt) ...
The method sdea
calculates super-efficiency and
returns the same class of object as dea
.
sdea(X, Y, RTS = "vrs", ORIENTATION = "in", DIRECT = NULL, param = NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL = NULL)
sdea(X, Y, RTS = "vrs", ORIENTATION = "in", DIRECT = NULL, param = NULL, TRANSPOSE = FALSE, LP = FALSE, CONTROL = NULL)
X |
Inputs of firms to be evaluated, a K x m matrix
of observations of K firms with m inputs (firm x input). In case
|
||||||||||||||||||||||||
Y |
Outputs of firms to be evaluated, a K x n matrix
of observations of K firms with n outputs (firm x input). In case
|
||||||||||||||||||||||||
RTS |
Text string or a number defining the underlying DEA
technology / returns to scale assumption, the same values as for
|
||||||||||||||||||||||||
ORIENTATION |
Input efficiency "in" (1), output efficiency
"out" (2), and graph efficiency "graph" (3). For use with
|
||||||||||||||||||||||||
DIRECT |
Directional efficiency, If the argument is a scalar, the direction is (1,1,...,1) times the scalar; the value of the efficiency depends on the scalar as well as on the unit of measurements. If the argument an array, this is used for the direction for
every firm; the length of the array must correspond to the
number of inputs and/or outputs depending on the
If the argument is a matrix then different directions are used
for each firm. The dimensions depends on the
|
||||||||||||||||||||||||
param |
Argument is at present only used when
|
||||||||||||||||||||||||
TRANSPOSE |
See the description in |
||||||||||||||||||||||||
LP |
Only for debugging, see the description in
|
||||||||||||||||||||||||
CONTROL |
Possible controls to lpSolveAPI, see the
documentation for that package. For examples of use see the
function |
Super-efficiency measures are constructed by avoiding that the evaluated firm can help span the technology, i.e. if the firm in qestuen is a firm on the frontier in a normal dea approach then this firm in super efficiency might be outside the technology set.
The object returned is a Farrell object with the component
described in dea
. The relevant components are
eff |
The efficiencies. Note when DIRECT is used then the efficencies are not Farrell efficiencies but rather excess values in DIRECT units of measurement. |
lambda |
The lambdas, i.e. the weight of the peers, for each Firm. |
objval |
The objective value as returned from the LP program; normally the same as eff. |
RTS |
The return to scale assumption as in the option
|
ORIENTATION |
The efficiency orientation as in the call. |
Calculation of slacks for super efficiency should be done by
using the option SLACK=TRUE
in the call of the method
sdea
. If the two phases are done in two steps as first a
call to sdea
and then a call to slacks
the user must
make sure to set the reference technology to the one corresponding
to super-efficiency in the call to slack
and this requires a
loop with calls to slack
.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011. Sect. 5.2 page 115
P Andersen and NC Petersen; “A procedure for ranking efficient units in data envelopment analysis”; Management Science 1993 39(10):1261–1264
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) se <- sdea(x,y) se # Leave out firm 3 as a determining firm of the technology set n <- 3 dea.plot.frontier(x[-n], y[-n], txt=(1:dim(x)[1])[-n]) # Plot and label firm 3 points(x[n],y[n],cex=1.25,pch=16) text(x[n],y[n],n,adj=c(-.75,.75))
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) se <- sdea(x,y) se # Leave out firm 3 as a determining firm of the technology set n <- 3 dea.plot.frontier(x[-n], y[-n], txt=(1:dim(x)[1])[-n]) # Plot and label firm 3 points(x[n],y[n],cex=1.25,pch=16) text(x[n],y[n],n,adj=c(-.75,.75))
Estimate a stochastic frontier production or cost function using a maximum likelihood method.
sfa(x, y, beta0 = NULL, lambda0 = 1, resfun = ebeta, TRANSPOSE = FALSE, DEBUG=FALSE, control=list(), hessian=2) sfa.cost(W, Y, COST, beta0 = NULL, lambda0 = 1, resfun = ebeta, TRANSPOSE = FALSE, DEBUG=FALSE, control=list(), hessian=2) te.sfa(object) teBC.sfa(object) teMode.sfa(object) teJ.sfa(object) te.add.sfa(object, ...) sigma2u.sfa(object) sigma2v.sfa(object) sigma2.sfa(object) lambda.sfa(object)
sfa(x, y, beta0 = NULL, lambda0 = 1, resfun = ebeta, TRANSPOSE = FALSE, DEBUG=FALSE, control=list(), hessian=2) sfa.cost(W, Y, COST, beta0 = NULL, lambda0 = 1, resfun = ebeta, TRANSPOSE = FALSE, DEBUG=FALSE, control=list(), hessian=2) te.sfa(object) teBC.sfa(object) teMode.sfa(object) teJ.sfa(object) te.add.sfa(object, ...) sigma2u.sfa(object) sigma2v.sfa(object) sigma2.sfa(object) lambda.sfa(object)
x |
Input as a K x m matrix of observations on m inputs from K firms; (firm x input); MUST be a matrix. No constant for the intercept should be included in x as it is added by default. |
y |
Output; K times 1 matrix (one output) |
Y |
Output; K times n matrix for m outputs; only to be used in cost function estimation. |
W |
Input prices as a K x m matrix. |
COST |
Cost as a K array for the K firms |
beta0 |
Optional initial parameter values |
lambda0 |
Optional initial ratio of variances |
resfun |
Function to calculate the residuals, default is a
linear model with an intercept. Must be called as
|
TRANSPOSE |
If TRUE, data is transposed, i.e. input is now m x K matrix |
DEBUG |
Set to TRUE to get various debugging information written on the console |
control |
List of control parameters to |
hessian |
How the Hessian is delivered, see the ucminf documentation |
object |
Object of class ‘sfa’ as output from the
function |
... |
Further arguments ... |
The optimization is done by the R method ucminf
from
the package with the same name. The efficiency terms are assumed to
be half–normal distributed.
Changing the maximum step length, the trust region, might be important, and this can be done by the option 'control = list(stepmax=0.1)'. The default value is 0.1 and that value is suitable for parameters around 1; for smaller parameters a lower value should be used. Notice that the step length is updated by the optimizing program and thus, must be set for every call of the function sfa if it is to be set.
The generic functions print.sfa
, summary.sfa
,
fitted.sfa
, residuals.sfa
, logLik.sfa
, and
coef.sfa
all work as expected.
The methods te.sfa
, teMode.sfa
etc. calculates the
efficiency corresponding to different methods
The values returned from sfa
is the same as for ucminf
,
i.e. a list with components plus some especially relevant for sfa:
par |
The best set of parameters found |
value |
The value of minus log-likelihood function corresponding to 'par'. |
beta |
The parameters for the function |
sigma2 |
The estimate of the total variance |
lambda |
The estimate of lambda |
N |
The number of observations |
df |
The degrees of freedom for the model |
residuals |
The residuals as a K times 1 matrix/vector,
can also be obtained by |
fitted.values |
Fitted values |
vcov |
The variance-covarians matrix for all estimated parameters incl. lambda |
convergence |
An integer code. '0' indicates successful
convergence. Some of the error codes taken from
'1' Stopped by small gradient (grtol). '2' Stopped by small step (xtol). '3' Stopped by function evaluation limit (maxeval). '4' Stopped by zero step from line search More codes are found in |
message |
A character string giving any additional information returned by the optimizer, or 'NULL'. |
o |
The object returned by |
Calculation of technical efficiencies for each unit can be done by the method te.sfa as shown in the examples.
te.sfa(sfaObject), teBC.sfa(sfaObject)
: Efficiencies estimated
by minimizing the mean square error; Eq. (7.21) in Bogetoft and
Otto (2011, 219) and Battese and Coelli (1988, 392)
teMode.sfa(sfaObject), te1.sfa(sfaObject)
: Efficiencies
estimates using the conditional mode approach; Bogetoft and Otto
(2011, 219), Jondrow et al. (1982, 235).
teJ.sfa(sfaObject), te2.sfa(sfaObject)
: Efficiencies estimates
using the conditional mean approach Jondrow et al. (1982, 235).
te.add.sfa(sfaObject)
Efficiency in the additive model,
Bogetoft and Otto (2011, 219)
The variance pf the distribution of efficiency can be calculated by
sigma2u.sfa(sfaObject)
, the variance of the random
error by sigma2v.sfa(sfaObject)
, and the total variance
(sum of variances of efficiency and random noise) by
sigma2.sfa
.
The ratio of variances of the efficiency and the random noise can be
found from the method lambda.sfa
The generic method summary
prints the parameters, standard
errors, t-values, and a few more statistics from the
optimization.
Peter Bogetoft and Lars Otto [email protected]
Bogetoft and Otto; Benchmarking with DEA, SFA, and R, Springer 2011; chapters 7 and 8.
See the method ucminf
for the possible optimization
methods and further options to use in the option control
.
The method sfa
in the package frontier gives another
way to estimate stochastic production functions.
# Example from the book by Coelli et al. # d <- read.csv("c:/0work/rpack/front41Data.csv", header = TRUE, sep = ",") # x <- cbind(log(d$capital), log(d$labour)) # y <- matrix(log(d$output)) n <- 50 x1 <- 1:50 + rnorm(n, 0, 10) x2 <- 100 + rnorm(n, 0, 10) x <- cbind(x1, x2) y <- 0.5 + 1.5*x1 + 2*x2 + rnorm(n, 0, 1) - pmax(0, rnorm(n, 0, 1)) sfa(x,y) summary(sfa(x,y)) # Estimate efficiency for each unit o <- sfa(x,y) eff(o) te <- te.sfa(o) teM <- teMode.sfa(o) teJ <- teJ.sfa(o) cbind(eff(o),te,Mode=eff(o, type="Mode"),teM,teJ)[1:10,] sigma2.sfa(o) # Estimated varians lambda.sfa(o) # Estimated lambda
# Example from the book by Coelli et al. # d <- read.csv("c:/0work/rpack/front41Data.csv", header = TRUE, sep = ",") # x <- cbind(log(d$capital), log(d$labour)) # y <- matrix(log(d$output)) n <- 50 x1 <- 1:50 + rnorm(n, 0, 10) x2 <- 100 + rnorm(n, 0, 10) x <- cbind(x1, x2) y <- 0.5 + 1.5*x1 + 2*x2 + rnorm(n, 0, 1) - pmax(0, rnorm(n, 0, 1)) sfa(x,y) summary(sfa(x,y)) # Estimate efficiency for each unit o <- sfa(x,y) eff(o) te <- te.sfa(o) teM <- teMode.sfa(o) teJ <- teJ.sfa(o) cbind(eff(o),te,Mode=eff(o, type="Mode"),teM,teJ)[1:10,] sigma2.sfa(o) # Estimated varians lambda.sfa(o) # Estimated lambda
Slacks are calculated after taking the efficiency into consideration.
slack(X, Y, e, XREF = NULL, YREF = NULL, FRONT.IDX = NULL, LP = FALSE, CONTROL=NULL)
slack(X, Y, e, XREF = NULL, YREF = NULL, FRONT.IDX = NULL, LP = FALSE, CONTROL=NULL)
X |
Inputs of firms to be evaluated, a K x m matrix of observations of K firms with m inputs (firm x input). |
Y |
Outputs of firms to be evaluated, a K x n matrix of observations of K firms with n outputs (firm x input). |
e |
A Farrell object as returned from |
XREF |
Inputs of the firms determining the technology, defaults
to |
YREF |
Outputs of the firms determining the technology,
defaults to |
FRONT.IDX |
Index for firms determining the technology |
LP |
Set |
CONTROL |
Possible controls to lpSolveAPI, see the
documentation for that package. For examples of use see the
function |
Slacks are calculated in a LP problem where the sum of all slacks are maximised after correction for efficiency. The for calculating slacks for orientation graph is low because of the low precision in the calculated graph efficiency.
The result is returned as the Farrell object used as the argument in the call of the function with the following added components:
slack |
A logical vector where the component for a firm is
|
sum |
A vector with sums of the slacks for each firm. Only
calculated in dea when option |
sx |
A matrix for input slacks for each firm, only calculated
if the option |
sy |
A matrix for output slack, see |
If a numerical problem occurs, status=5, or if no solution can be found,
the best solution is often to scale the input X
and output
Y
yourself or use the option CONTROL
to change scaling in
the program itself, as described in the notes for dea
.
Peter Bogetoft and Lars Otto [email protected]
Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011. Sect. 5.6 page 127.
WW Cooper, LM Seiford, and K Tone; Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software, 2nd edn. Springer 2007 .
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) dea.plot.frontier(x,y,txt=1:dim(x)[1]) e <- dea(x,y) eff(e) # calculate slacks sl <- slack(x,y,e) data.frame(e$eff,sl$slack,sl$sx,sl$sy)
x <- matrix(c(100,200,300,500,100,200,600),ncol=1) y <- matrix(c(75,100,300,400,25,50,400),ncol=1) dea.plot.frontier(x,y,txt=1:dim(x)[1]) e <- dea(x,y) eff(e) # calculate slacks sl <- slack(x,y,e) data.frame(e$eff,sl$slack,sl$sx,sl$sy)
Convex nonparametric least squares here for convex (Cost) function function or concave (Production) function with multiplicative or additive error term. the StoNED estimator combines the axiomatic and non-parametric frontier (the DEA aspect) with a stochastic noise term (the SFA aspect)
stoned(X, Y, RTS = "vrs", COST = 0, MULT = 0, METHOD = "MM")
stoned(X, Y, RTS = "vrs", COST = 0, MULT = 0, METHOD = "MM")
X |
Inputs (right hand side) of firms to be evaluated, a K x m matrix of observations of K firms with m inputs (firm x input). |
Y |
Output or cost (left hand side) of firms to be evaluated, a K x 1 matrix of observations of K firms with 1 output or cost (firm x input). |
RTS |
RTS determines returns to scale assumption: RTS="vrs",
"drs", "crs" and "irs" are possible for constant or variable returns
to scale; see |
COST |
COST specifies whether a cost function needs is estimated (COST=1) or a production function (COST=0). |
MULT |
MULT determines if multiplicative (MULT=1) or additive (MULT=0) model is estimated. |
METHOD |
METHOD specifies the way efficiency is estimated: MM for Method of Moments and PSL for pseudo likelihood estimation. |
Convex nonparametric least squares here for convex (cost) function with multiplicative error term: Y=b*X*exp(e) or additive error term: Y=b*X + e.
The results are returned in a list with the components:
residualNorm |
Norm of residual |
solutionNorm |
Norm of solution |
error |
Is there an error in the solution? |
coef |
beta_matrix, estimated coefficients as a Kxm matrix; if there is an intercept the first column is the intercept, and the matrix is Kx(1+m) |
residuals |
Residuals |
fit |
Fitted values |
eff |
Efficinecy score |
front |
Points on the frontier |
sigma_u |
sigma_u |
Convex nonparametric least squares here for convex (Cost) function
with multiplicative error term:
Y=b*X*exp(e)
or additive error term: Y=b*X + e
.
The intercept is absent for the constant returns to scale assumption; all other technology assumptions do have an intercept.
Note that the method stoned
is a rather slow method and probably only
works in a reasonable time for less than 3-400 units.
Stefan Seifert [email protected] and Lars Otto [email protected]
Kuosmanen and Kortelainen, "Stochastic non-smooth envelopment of data: semi-parametric frontier estimation subject to shape constraints", Journal of Productivity Analysis 2012
#### Example: Single Input Production Function n=10 x1 <- runif(n,10,20) v <- rnorm(n,0,0.01) u <- abs(rnorm(n,0,0.04)) y <- (x1^0.8)*exp(-u)*exp(v) sol_MM <- stoned(x1, y) sol_PSL <- stoned(x1, y, METHOD="PSL") plot(x1,y) curve(x^0.8, add=TRUE) points(x1,sol_MM$front, col="red") points(x1,sol_PSL$front, col="blue", pch=16, cex=.6)
#### Example: Single Input Production Function n=10 x1 <- runif(n,10,20) v <- rnorm(n,0,0.01) u <- abs(rnorm(n,0,0.04)) y <- (x1^0.8)*exp(-u)*exp(v) sol_MM <- stoned(x1, y) sol_PSL <- stoned(x1, y, METHOD="PSL") plot(x1,y) curve(x^0.8, add=TRUE) points(x1,sol_MM$front, col="red") points(x1,sol_PSL$front, col="blue", pch=16, cex=.6)
Calculates the probability of a type I error for a test in bootstrapped DEA models.
typeIerror(shat,s)
typeIerror(shat,s)
shat |
The value of the statistic for which the probability of a type I error is to be calculated |
s |
Vector with calculated values of the statistic for each of
the |
Needs bootstrapped values of the test statistic
Returns the probability of a type I error
Peter Bogetoft and Lars Otto [email protected]
boot.sw98
in FEAR, Paul W. Wilson (2008),
“FEAR 1.0: A Software Package for Frontier Efficiency
Analysis with R,” Socio-Economic Planning Sciences 42,
247–254
# Probability of getting something larger than 1.96 in 10000 random # standard normal variates. x <- rnorm(10000) typeIerror(1.96,x)
# Probability of getting something larger than 1.96 in 10000 random # standard normal variates. x <- rnorm(10000) typeIerror(1.96,x)