public final class Runner extends Object implements Callable<Boolean>, Closeable
This is the runnable class which starts the Serving Layer and its Tomcat-based HTTP server. It is
started with call() and can be shut down with close(). This implementation is used
both in stand-alone local mode, and in a distributed mode cooperating with a Computation Layer.
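Because Runner implements both Callable<Boolean> and Closeable, its lifecycle fits a try-with-resources block. The following is a minimal sketch of that lifecycle only. StubServer is a hypothetical stand-in written for this example, not the actual Runner, and the value it returns from call() is an assumption.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;

public class LifecycleSketch {

  /** Hypothetical stand-in with the same interfaces as Runner; it starts no real server. */
  static final class StubServer implements Callable<Boolean>, Closeable {
    private boolean running;

    @Override
    public Boolean call() throws IOException {
      running = true; // a real implementation would start its HTTP server here
      return Boolean.TRUE;
    }

    @Override
    public void close() {
      running = false; // a real implementation would stop its HTTP server here
    }

    boolean isRunning() {
      return running;
    }
  }

  /** Starts the server, reports whether it is running, and lets try-with-resources close it. */
  static boolean startAndStop(StubServer server) {
    try (StubServer s = server) {
      s.call();
      return s.isRunning();
    } catch (IOException e) {
      // call() declares IOException, so startup failures surface here
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    StubServer server = new StubServer();
    System.out.println("ran: " + startAndStop(server));
    System.out.println("still running: " + server.isRunning());
  }
}
```

With try-with-resources, close() runs even if call() throws, which avoids leaving a half-started server behind.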
This program instantiates a Tomcat-based HTTP server exposing a REST-style API. It is available via
HTTP, or HTTPS as well if RunnerConfiguration.getKeystoreFile() is set. It can also be password
protected by setting RunnerConfiguration.getUserName() and
RunnerConfiguration.getPassword().
Runner is configured by RunnerConfiguration but when run as a command-line program,
it is configured via a set of analogous flags:
--localInputDir: Optional. The local directory used for reading input, writing output, and storing
user input and model files in local mode. In distributed mode, it is used for staging input for upload.
Defaults to the system temp directory.
--bucket: Identifies the root directory of storage under which data is stored and computation takes
place in distributed mode. Only applicable in distributed mode. Must be set with --instanceID.
--instanceID: Uniquely identifies the recommender from others that may be run by the same
organization. Only applicable in distributed mode. Must be set with --bucket.
--port: Port on which to listen for HTTP requests. Defaults to 80. Note that the server must be run
as the root user to access port 80.
--securePort: Port on which to listen for HTTPS requests. Defaults to 443. Likewise, note that
using port 443 requires running as root.
--contextPath: URI base for endpoint URIs; defaults to none (the root context). Setting it is not
recommended, but if set to "foo", it will cause the recommend method endpoint, for example, to be
accessed at /foo/recommend instead of /recommend.
--readOnly: If set, disables methods and endpoints that add, remove or change data.
--keystoreFile: File containing the SSL key to use for HTTPS. Setting this flag
enables HTTPS connections, and so requires that option --keystorePassword be set. In distributed
mode, if not set, the server will attempt to load a keystore file from the distributed file system,
at sys/keystore.ks.
--keystorePassword: Password for keystoreFile. Setting this flag enables HTTPS connections.
--userName: If specified, the user name required to authenticate to the server using
HTTP DIGEST authentication. Requires password to be set.
--password: Password for HTTP DIGEST authentication. Requires userName to be set.
--consoleOnlyPassword: Only apply username and password to admin / console pages.
--hostRequestLimit: Maximum number of requests per minute from a host before it is temporarily blocked.
This provides only a basic attempt to deny requests and is not guaranteed to block any DoS attack.
--rescorerProviderClass: Optional. Name of an implementation of
RescorerProvider to use to rescore recommendations and similarities, if any. The class
must be added to the server classpath. Or, in distributed mode, if not found in the classpath, it
will be loaded from a JAR file found on the distributed file system at sys/rescorer.jar.
This may also be specified as a comma-separated list of class names, in which case all will be
applied, in the given order.
--clientThreadClass: Optional. Name of an implementation of ClientThread
which is intended to be run in the Serving Layer in its own thread as an in-process "client" of
external services. This may be used to poll/pull updates from some external service and push
directly into the recommender, or perform any other service that a caller needs. The thread will
be started with the web container and closed with the web container.
--allPartitions: Optional, but must be set with --partition.
Describes all partitions, when partitioning across Serving Layers
by user. Each partition may have multiple replicas. When running in distributed mode on Amazon EC2,
may be specified as "auto", in which case it will
attempt to discover partition members dynamically, searching for instances tagged with EC2 key
"myrrix-partition" and whose value is a partition. (Port may be specified with EC2 tag "myrrix-port" if not
the default of 80, and instances may be uniquely associated to a bucket and instance with "myrrix-bucket" and
"myrrix-instanceID" EC2 tags if needed.) Otherwise, replicas are specified as many Serving Layer
"host:port" pairs, separated by commas, like "rep1:port1,rep2:port2,...".
Finally, partitions are specified as multiple replicas separated by semicolons, like
"part1rep1:port11,part1rep2:port12;part2rep1:port21,part2rep2:port22;...". Example:
"foo:80,foo2:8080;bar:8080;baz2:80,baz3:80"
--partition: Optional, but must be set with --allPartitions.
The partition (0-based) that this Serving Layer is serving.
--licenseFile: (Optional in standalone mode.) Location of a license file named [subject].lic,
where [subject] is the subject name authorized in the license. The license file should be valid at the
time the app is run, and contain authorization to use the amount of parallelism (max simultaneous
Hadoop workers) requested.
When run in local mode, the Serving Layer instance will compute a model locally and save it as the file
model.bin.gz in the --localInputDir directory. It will be updated when the model is rebuilt.
If the file is present at startup, it will be read to restore the server state, rather than re-reading
CSV input in the directory and recomputing the model. Thus the file can be saved and restored as a
way of preserving and recalling the server's state of learning.
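Since model.bin.gz captures the server's learned state, preserving it amounts to a plain file copy. The following is a minimal sketch, assuming the server is not rewriting the file while it is copied. ModelBackup, copyModel and roundTrip are hypothetical names for this example and are not part of the Serving Layer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class ModelBackup {

  static final String MODEL_FILE = "model.bin.gz"; // file name given in the text above

  /** Copies model.bin.gz from one directory to another; returns false if the source has none. */
  static boolean copyModel(Path fromDir, Path toDir) {
    Path model = fromDir.resolve(MODEL_FILE);
    if (!Files.isRegularFile(model)) {
      return false;
    }
    try {
      Files.createDirectories(toDir);
      Files.copy(model, toDir.resolve(MODEL_FILE), StandardCopyOption.REPLACE_EXISTING);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Saves a placeholder model, deletes the original, restores it, and compares the bytes. */
  static boolean roundTrip() {
    try {
      Path localInputDir = Files.createTempDirectory("localInputDir");
      Path backupDir = Files.createTempDirectory("backup");
      byte[] placeholder = {1, 2, 3}; // stand-in bytes, not a real model
      Path model = localInputDir.resolve(MODEL_FILE);
      Files.write(model, placeholder);

      boolean saved = copyModel(localInputDir, backupDir);    // preserve the state
      Files.delete(model);
      boolean restored = copyModel(backupDir, localInputDir); // put it back before a restart
      return saved && restored && Arrays.equals(placeholder, Files.readAllBytes(model));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("round trip ok: " + roundTrip());
  }
}
```

Restoring is the same copy in the opposite direction, done before startup so that the file is present in the --localInputDir directory when the server reads it.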
Example of running in local mode:
java -jar myrrix-serving-x.y.jar --port 8080
(with an example of JVM tuning flags:)
java -Xmx1g -XX:NewRatio=12 -XX:+UseParallelOldGC -jar myrrix-serving-x.y.jar --port 8080
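The --allPartitions value described above (replicas separated by commas, partitions separated by semicolons) can be parsed in a few lines. This is a sketch of the format only, not the Serving Layer's own parser. PartitionSpec and HostPort are hypothetical names, and the special "auto" value is not handled.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSpec {

  /** One replica address. Hypothetical helper type for this sketch. */
  static final class HostPort {
    final String host;
    final int port;

    HostPort(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public String toString() {
      return host + ":" + port;
    }
  }

  /** Parses "h:p,h:p;h:p;..." into partitions (outer list), each a list of replicas. */
  static List<List<HostPort>> parse(String spec) {
    List<List<HostPort>> partitions = new ArrayList<>();
    for (String partition : spec.split(";")) {
      List<HostPort> replicas = new ArrayList<>();
      for (String replica : partition.split(",")) {
        int colon = replica.lastIndexOf(':');
        if (colon <= 0) {
          throw new IllegalArgumentException("Expected host:port but got: " + replica);
        }
        String host = replica.substring(0, colon).trim();
        int port = Integer.parseInt(replica.substring(colon + 1).trim());
        replicas.add(new HostPort(host, port));
      }
      partitions.add(replicas);
    }
    return partitions;
  }

  public static void main(String[] args) {
    // Example value taken from the --allPartitions description
    List<List<HostPort>> all = parse("foo:80,foo2:8080;bar:8080;baz2:80,baz3:80");
    for (int i = 0; i < all.size(); i++) {
      System.out.println("partition " + i + " -> " + all.get(i));
    }
  }
}
```

The index in the outer list corresponds to the 0-based number passed as --partition.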
Finally, some more advanced tuning parameters are available. These are system parameters, set with
-Dproperty=value.
model.features: The number of features used in building the underlying user-feature and
item-feature matrices. Typical values are 30-100. Defaults to
MatrixFactorizer#DEFAULT_FEATURES.
model.als.iterations.convergenceThreshold: Controls when model building iterations stop.
When the average change in the scores estimated for user-item pairs falls below this threshold,
iteration stops.
model.iterations.max: Caps the number of iterations.
model.als.lambda: Controls the lambda overfitting parameter in the ALS algorithm.
Typical values are near 0.1. Do not change this, in general. Defaults to
AlternatingLeastSquares#DEFAULT_LAMBDA.
model.als.alpha: Controls the alpha scaling parameter in the ALS algorithm.
Typical values are near 1 or above. Do not change this, in general. Defaults to
AlternatingLeastSquares#DEFAULT_ALPHA.

| Constructor and Description |
|---|
| Runner(RunnerConfiguration config): Creates a new instance with the given configuration. |
public Runner(RunnerConfiguration config)
public org.apache.catalina.startup.Tomcat getTomcat()
Returns: the Tomcat server that is being configured and run inside this instance.
public Boolean call() throws IOException
Specified by: call in interface Callable<Boolean>
Throws: IOException
public void await()
public void close()
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Copyright © 2012-2013. All Rights Reserved.