% Created 2016-11-11 Fri 08:35 \documentclass[11pt]{article} \usepackage[latin1]{inputenc} \usepackage[T1]{fontenc} \usepackage{fixltx2e} \usepackage{graphicx} \usepackage{longtable} \usepackage{float} \usepackage{wrapfig} \usepackage{soul} \usepackage{textcomp} \usepackage{marvosym} \usepackage{wasysym} \usepackage{latexsym} \usepackage{amssymb} \usepackage{hyperref} \tolerance=1000 \usepackage{amsmath} \usepackage{algpseudocode} \usepackage[margin=0.75in]{geometry} \usepackage{nth} \usepackage{tikz} \usetikzlibrary{positioning,chains,fit,shapes,calc} \usepackage{ccicons} \usepackage{fancyhdr} \usepackage{hyperref} \fancyhead{}\fancyfoot[C]{\tiny This work is licensed under a \href{http://creativecommons.org/licenses/by/4.0/}{Creative Commons Attribution 4.0 International License. \ccby}\\ For license purposes, the author is the University of British Columbia.}\pagestyle{fancyplain} %\fancyhead{}\fancyfoot[C]{\tiny This work is licensed under a \href{http://creativecommons.org/licenses/by-nc/4.0/}{Creative Commons Attribution-NonCommercial 4.0 International License. \ccbync}\\ For license purposes, the author is the University of British Columbia.}\pagestyle{fancyplain} \providecommand{\alert}[1]{\textbf{#1}} \title{CPSC 320 2016W1: Assignment 5} \author{} \date{\today} \hypersetup{ pdfkeywords={}, pdfsubject={}, pdfcreator={Emacs Org-mode version 7.9.3f}} \begin{document} \maketitle Submit this assignment via handin (see the \href{http://blogs.ubc.ca/cpsc320/syllabus/#assignments}{syllabus} for more information) to the target \texttt{assn5} by the deadline \textbf{Thursday 24 Nov at 10PM}. For credit, your group must make a \textbf{single} submission via one group member's account. Your group's submission \textbf{must}: \begin{itemize} \item Be on time. \item Consist of a single, clearly legible PDF file named \texttt{solution.pdf} with clearly indicated solutions to the problems. (Directly produced via \LaTeX{}, Word, Google Docs, or other editing software is best. Scanned is fine. \textbf{High-quality} photographs are OK if we agree they're legible.) \item Include prominent numbering that corresponds to the numbering used in this assignment handout (not the individual quiz postings). Put these \textbf{in order}, ideally. If not, \textbf{very clearly} and prominently indicate which problem is answered where! \item Include at the start of the document the names and ugrad.cs.ubc.ca account IDs of each member of your team. \item Include at the start of the document the statement: ``All group members have read and followed the \href{http://blogs.ubc.ca/cpsc320/syllabus/#conduct}{guidelines for academic conduct in CPSC 320}. As part of those rules, when collaborating with anyone outside my group, (1) I and my collaborators took no record but names away, and (2) after a suitable break, my group created the assignment I am submitting without help from anyone other than the course staff.'' (Go read those guidelines!) \item Include at the start of the document acknowledgment of collaborators and references (with the exceptions listed in the conduct guidelines). \end{itemize} \section{Lowest-Cost (Not So) Simple Path} \label{sec-1} %\ Mon 9AM %\clearpage Imagine a weighted, directed graph $G$ where edge weights may be positive, negative, or zero. We will consider the problem of finding the lowest-cost simple path between a source node $s$ and terminal node $t$ in such a graph. We'll call this problem GENSHORT for ``general shortest path''. (Recall that a \textbf{simple} path is a path with no vertex repeated, i.e., with no cycles.) (Recall that the Bellman-Ford Algorithm---as presented in our text---finds the shortest path from any start vertex in the graph to a single terminal vertex $t$. It proceeds using dynamic programming using a table parameterized by which node is being considered as $s$ and the maximum number of edges in the path from $s$ to $t$. The first column (where the maximum number of edges is 0) has $\infty$ for all nodes except $t$ itself and 0 for $t$. On each iteration, it updates each row $s$ in the next column based on the lowest-cost path of all those that go from $s$ to some node $u$ (in one edge) and then from $u$ to $t$ using the already-computed value in the previous column.) \begin{enumerate} \item \textbf{Very briefly} explain why the Bellman-Ford algorithm cannot in general be used to solve GENSHORT. %\vfill \item Give a small instance of GENSHORT on which the Bellman-Ford algorithm \textbf{will} find the lowest-cost simple path from $s$ to $t$. Be sure to indicate what that lowest-cost simple path is. %\vfill %\vfill %\clearpage \item Here is a proposed reduction from GENSHORT to the problem of finding the lowest-cost simple path between a source node $s$ and terminal node $t$ in a weighted, directed graph with \textbf{only non-negative edge weights}: \textbf{Reduction:} Given the graph $G$ that may contain negative edge weights, find the edge with minimum weight $w_{min}$ (by scanning through all edges) and subtract $w_{min}$ from the weight of every edge to create graph $G'$. In $G'$ the minimum weight edge has weight 0, and no edge has negative weight. Find the lowest-cost simple path between $s$ and $t$ in $G'$ (i.e., call on the solution to the underlying problem), and then return this list of vertices as the lowest-cost simple path in the original graph. (Of course, the edges connecting the vertices have different weights in $G$, but it's still the same path.) Give a small instance of GENSHORT on which this reduction does \textbf{not} produce the optimal solution. Indicate the solution produced by the reduction and the optimal solution. %\vfill \end{enumerate} %\clearpage \subsection{NP-Completeness} \label{sec-1-1} In this part, we will consider a decision-variant of GENSHORT. In this variant, we add a number $k$ to the format of an instance. The solution to the instance is YES if a simple path from $s$ to $t$ exists with cost less than or equal to $k$; otherwise, the solution is NO. \begin{enumerate} \item Prove---by reducing from the HAMPATH problem to GENSHORT---that GENSHORT is NP-hard. (Note: HAMPATH is NP-complete.) \emph{Hint:} it may help to add a couple of nodes to be $s$ and $t$. When thinking about edges to and from those nodes, consider that you can have zero-weight edges. \item Prove that the decision version of GENSHORT is in NP by showing it is ``efficiently certifiable''. First, select a certificate. (Think of how you would describe the solution to the \textbf{original} version of GENSHORT.) Then, show how to prove in time polynomial in the size of the decision-variant GENSHORT instance that the answer to the decision problem is YES given such a certificate. (A decision-variant GENSHORT instance is a graph plus one extra number; think of its size as $O(n + m)$ as usual for graphs.) \end{enumerate} (This isn't required, but you might want to work through how you could solve the original variant of GENSHORT using a polynomial number of calls to the decision-variant.) %\clearpage \section{Seam Carving} \label{sec-2} %\ Mon 3PM %\clearpage You can resize an image by scaling or cropping it, but what if the pieces of the image that you want are not all in one rectangular area, and you don't want to make those parts of the image smaller by scaling?\footnote{This method was \href{http://dl.acm.org/citation.cfm?id=1276390}{developed by Avidan and Shamir}. } In that case, you might instead choose to eliminate one pixel from each row (to make the image one pixel narrower) or one pixel from each column (to make the image shorter) while somehow optimizing for the ``best'' pixels to remove. In this problem, we focus on removing one pixel from each row. We'll assume an image is an $n$ column by $m$ row array of pixels $A[1\ldots n][1\ldots m]$, where each pixel is an ``energy'' rather than a color. Energies are non-negative numbers representing the importance of the pixel. A legal seam must include one pixel from every row. Each pair of seam pixels in neighbouring rows must be either in the same column or one column apart (i.e., on a diagonal). The cost of a seam is the total energy of all the pixels in the seam. The best seam is the one with lowest cost. So, a seam of pixels to remove often looks a little like a ``lightning bolt'' moving down, down-and-left, and down-and-right from the top to the bottom of the image, such as this: \begin{tikzpicture}[thick,scale=0.50] \draw (0, 0) rectangle (8, 8); \draw (5, 8) -- (5, 7) -- (4, 6) -- (5, 5) -- (4, 4) -- (3, 3) -- (3, 2) -- (2, 1) -- (3, 0); \end{tikzpicture} \begin{enumerate} \item Circle two non-overlapping seams in this diagram that have different costs. Indicate their costs and which seam is better: \begin{verbatim} 1 8 7 5 6 2 4 9 5 1 2 8 8 7 6 6 2 1 9 5 4 \end{verbatim} \item Give a recurrence for the cost of the best partial seam that has pixels only from row $0$ up to $i$ and ends at the pixel in row $i$ and column $j$. Your recurrence should be in terms of the seams ending at pixels in the row above, row $i-1$. Assume $i > 0$. %\vfill \begin{verbatim} C(i, j) = \end{verbatim} %\vfill \item Give the cost for a partial seam that only has a pixel in the very first row, $i = 0$. (This is our base case.) %\vfill \begin{verbatim} C(0, j) = \end{verbatim} %\vfill \end{enumerate} %\clearpage \subsection{Dynamic Programming} \label{sec-2-1} \begin{enumerate} \item Give a pseudocode algorithm that finds the cost of the best seam in an energy array using dynamic programming. \item Give a pseudocode algorithm that takes the dynamic programming table and produces the column numbers of the pixels in the best seam. (So, if the best seam has a pixel at column 3 in the first row and column 4 in the second, then your solution should give the list $[3, 4]$.) \end{enumerate} \subsection{Bonus: Implement Awesomely} \label{sec-2-2} Look up seam carving and design an implementation. Maybe extend it to apply to video! %\clearpage \section{Transformers} \label{sec-3} %\ Tue 1PM %\clearpage In the ELEC problem, you're given a network of electrical wires which can be represented as a directed, acyclic graph (DAG) with three types of nodes: \begin{itemize} \item ``Switch'' nodes supply power. They have \textbf{no} wires coming in and two wires going out labeled ``up'' and ``down''. They also have a switch. If the switch is in the up position, then power (electricity) flows into the up wire. If the switch is in the down position, then power flows into the down wire. \item ``Branch'' nodes can have one wire coming in (which may or may not carry power) and any number of wires going out. If the wire coming in carries power, then all wires going out also carry power. Otherwise, none of the wires carries power. \item ``Load'' nodes represent electrical devices that must be powered. They have one or more wires coming in and none going out. If any wire coming in carries power, the load is powered. Otherwise, it is not. \end{itemize} The solution to an ELEC instance is YES if some configuration of the switches powers all the loads; otherwise, it's NO. \begin{enumerate} \item Indicate a configuration of the switches in the following network that powers all the loads by writing ``up'' or ``down'' on each switch node. (Switch nodes are labeled S, branch nodes B, and load nodes~L.) \includegraphics[width=0.3\textwidth]{figures/2016W1-a5-elec-network.png} \item Give a reduction from SAT to ELEC. \emph{Hint:} Consider that a variable can be positive or negated, the positive (or negated) literal can appear in many clauses, and each clause needs at least one true literal in it. %\clearpage \end{enumerate} \subsection{NP-Completeness} \label{sec-3-1} You have already shown that ELEC is NP-hard with your reduction from SAT to ELEC above. Now, you will show that ELEC is in NP. (Together, the fact that it is NP-hard and in NP makes it NP-complete.) \begin{enumerate} \item Give a (polynomial-length) certificate for ELEC instances where all loads can be powered. \emph{Hint:} We already asked you for your ``solution'' to an ELEC problem above. What form does such a solution take? \item Give an algorithm that takes polynomial time in the size of an ELEC instance to determine whether all loads can be powered in that instance, given a certificate like the the one you describe above. \end{enumerate} \section{The Benefits of Preparallelation?} \label{sec-4} %\ Wed 12PM %\clearpage You're setting up a process to run in parallel on a huge array of processors.\footnote{Skippable note: ``Huge'' means big enough for asymptotics to matter. For example, the \href{https://www.top500.org/resources/top-systems/sunway-taihulight-national-supercomputing-center-i/}{Sunway TaihuLight supercomputer in China} has over 10 million cores, which is 3 times as many as last year's biggest. } The process is a chain of $n \geq 1$ computations $c_1, \ldots, c_n$ connected by $n-1$ operations $\circ_1, \ldots, \circ_{n-1}$ of the form $(c_1 \circ_1 c_2 \circ_2 \ldots \circ_{n-2} c_{n-1} \circ_{n-1} c_n)$. You receive an array $C[1\ldots n]$ of costs of each computation $c_i$ and an array $D[1\ldots n-1]$ of the cost of each operator $\circ_i$. The process runs on $n$ processors, each running a single operation $c_i$ in time $C[i]$. Then, one of two processors with results from neighbouring computations collects results from the other processor and combines them in time $D[j]$ using the appropriate operator $\circ_j$. The way the process is split across processors and then combined by $\circ$ operators forms a binary tree. For example, with $n=5$ processes with costs $C = [4, 6, 8, 10, 2]$ and $D = [5, 3, 1, 7]$, we might combine the results in the fashion described by one of these two trees: \includegraphics[width=0.3\textwidth]{figures/2016W1-a5-parallel-tree-1.png} \hfill \includegraphics[width=0.4\textwidth]{figures/2016W1-a5-parallel-tree-2.png} The left-hand tree combines the results from $c_1$ with the results from $c_2$ first, adds in $c_3$'s results, then $c_4$'s, and then $c_5$'s. The right-hand combines the result from $c_1$ with the combination of $c_2$'s and $c_3$'s results, and combines that result with the combination of $c_4$'s and $c_5$'s results. The overall cost of a way to split up the process (i.e., a tree) is the maximum cost path from the root to a leaf (because that is the longest-running series of non-parallelizable steps). The left tree's cost is $7 + 1 + 3 + 5 + 6 = 22$. The right tree's cost is lower: $1 + 7 + 10 = 18$. So, the right tree is a better way to split up the process. %\clearpage \begin{enumerate} \item Give a pseudocode algorithm that---given a binary tree like the diagrams above (i.e., a pointer to its root)---computes the overall cost of that tree in linear time. %\vfill %\vfill \item Here is a proposed greedy algorithm to choose the best binary tree (i.e., way to split up the process): \textbf{Algorithm:} If only a single computation remains, produce a single node for that computation. Otherwise, choose the least expensive operation $o_i$, make it the root of the tree, and produce its left and right subtrees by recursively applying the greedy algorithm to the parts of the process to the left and right of $o_i$. (For example, it would start by splitting the sample problem above into $(c_1 \circ_1 c_2 \circ_2 c_3)$ and $(c_4 \circ_4 c_5)$ because $\circ_3$ is the lowest-cost operation.) Give the smallest possible counterexample to the optimality of this algorithm. All operations $\circ_i$ in your counterexample must have distinct cost (so that ``least expensive operation'' is unique). %\vfill %\vfill \end{enumerate} %\clearpage \subsection{A Dynamic Programming Approach} \label{sec-4-1} \begin{enumerate} \item Here is a portion of a tree representing a way to split a process with $n=15$ computations up, with three boxes where the tree has not been finished. \includegraphics[width=0.4\textwidth]{figures/2016W1-a5-parallel-tree-boxes-empty.png} Fill in the boxes above with the computations still left to perform. Note: we do \textbf{not} want subtrees but computations like $(c_1 \circ_1 c_2 \circ_2 c_3 \circ_3 \ldots \circ_{n-2} c_{n-1} \circ_{n-1} c_n)$. \item Give a succinct (short and clear) way to specify which computations go in each box. \item Give a recurrence to describe the cost of the optimal way to split up the process. (Note that your algorithm for computing the overall cost of a tree also describes the cost of a particular split choice, while your succinct way to specify what goes in the boxes parameterizes the problem.) \item Write a dynamic programming solution to the problem of finding the cost of the optimal way to split up a process. \item Write a function that takes the table from your dynamic programming solution and produces the binary tree representing the split. (Each node in the tree should be labeled with the index of its operator---for internal nodes---or computation---for leaf nodes.) \end{enumerate} %\clearpage \end{document}